


SEQ 3
SEQ 6
SEQ 8
SEQ 10
SEQ 12
SEQ 14
SEQ 16
SEQ 19
SEQ 22
SEQ 24
SEQ 27
SEQ 30
SEQ 33
SEQ 35
SEQ 38
SEQ 40
SEQ 42
SEQ 44
SEQ 83
SEQ 85
Bacteria 

T44612 

NP_625402 

NP_295913 

AF320254 

OYE family 

Af4875 

Af4961 

Ca2460 

Nc4452 

ScOYE1

ScOYE2

ScOYE3

A36990 



MTV AD 

- -MSQPWPD 
--MGSNAFRS 

MALPD 

MTVPYQVKPS 

- -MADFTQKK 



MTG 

-MAY EI 



IDVPPAEGIP YFTPAQNPPA GTAANPQTN- --GQKIPKLF TPLTIR-GVT : 
IENKPAPGIS YFTPAQEPPA GTAANPQSDG - - -SAPPKLF RPLSVR-GLT I 

PAVTKSSSTP YYTPANNGGA ALHPDDPT- TPTLF RPLQIR-NVT 1 

VENTPAAGIP YFTPAQNPPA GTAANPQTSG ---NAVPKLY TPLTVR-GVT 1 

DEIKGAPEVS YYTPEQPVPA GTFYPQSSD- EVAPK1 F QPLKIG - KLA ! 

- - MENN NTIPALF QPIKISDSIT ] 

TSSPAAPGVP FYT PAQVPAA GTPLPSTPG DVPTLF TPLKIR-GVE I 

- --- -- - MATST TSDLKLS QPLTLPNGLT I 

TLSKPAAGVP YYTPAQEPPA GTPLQQQDA IPTLF KPLKIR-GVE ! 

IVNEGAENVG YFTPAQKI PA GAAIG-VP- - QTXLF TPLKIR-GVE 1 

TANKAAPGVP FYTPAQEPPA GTPVDASTA- PTLF KPLRIR-DLT 

IDNVAAEGVP YYTPAQDPPA GTQTSG STKLF TPITIR-GVT 1 



-MP KCEANGHHKI I INKEAPNVP FYTPVQDPPA GTSYDVQPEG 
ARGI IDNIAAEGAP YYTPAQD . PA GTQTSG ST - - 



-SLF SLIKIR-NLT LQ- 
-KVF T.ITIR-GVT FP- 
--LKIR-GLT LQ- 



NRLGLAPLCQ 
NRIGLSPLCQ 
NRIMVSPMCM 
NRLGLAPLCQ 
KRIGVSPMCQ 
NRIGVSPMCM 
NRFAVAPMCT 
NRLVKAAMAE 
NRFGVSPMCT 
NRMFVSPMCT 
NRIWVSPMCQ 
NRLFLAPLCQ 

NRIFVSPMCQ 
NRLFLAPLCQ 
NRIMLRGLCQ 



-MSPPRFEAA PADPSPLG-- TPLKY PVSGR- - SAP NRFLNAAMSE 

MTVQSQQQSQ AIPVLSSQNG TEPQDANKEV VQNVAAKGVQ YFNPEQLPAP GLGINGPNNT LPKVF TPIKIR-GMT MP-- NRIWVSPMCQ 

MDTS RFVSGLTPPL VDSIDALKIS NFVPTRSGHP PPGSVPESIL PBGVKKPALF QTLTLP-FAA PEQAGKMTFK NRIIVSPMCQ 



- -- MSALF EPYTLK - DVT LR- 

- --- -- MSALF EPFRliR-DTT IP- 

MTVSSAA APQPASPAA- PLLP TPLKLR-SLE LP- 

-MYSMLT RSQRISHENL RLRDAGWLEG YERWLARKAG MTVRDDETP PPPMF TPFKLR-GLT LA- 



NRIAIPPMCQ 
NRIWMPPMCQ 
NRVWSPMCT 
NRIVMSPMAM 



MREEPSSAQ LF KPLKVG--RC HLQ- - 

MTI RKLDGEESM- LF QPLEIA-NGR IRLS- 

-MTVESTNS FWPAGTKQI EIAPLGSTK LF QPIKVG-KNI LP 

MAATAAESR- LF QPLKLTPKIT LG 

MS FVKD F KPQALGDTN- LF KPIKIG-NNE LL- - - 

-MP FVKD F KPQALGDTN LF KPIKIG-NNE LL- - - 

MP FVKG F EPISLRDTN- LF EPIKIG-NTQ LA- - - 

-MTIESTNS FWPSDTKLI DVTPLGSTK LF QPIKVG-NNV LP--- 



141 



151 



161 



171 



181 



HRMIMAPTTR 
HRWHAPMTR 
HRVAHAPTTR 
HRLAMAPLTR 
HRAVI PPLTR 
HRAVI PPLTR 
HRAVMPPLTR 
QR I A YVPTTR 

191 



SEQ 3
SEQ 6
SEQ 8
SEQ 10
SEQ 12
SEQ 14
SEQ 16
SEQ 19
SEQ 22
SEQ 24
SEQ 27
SEQ 30
SEQ 33
SEQ 35
SEQ 38
SEQ 40
SEQ 42
SEQ 44
SEQ 83
SEQ 85
Bacteria
T44612
NP_625402
NP_295913
AF320254
OYE family
Af4875
Af4961
Ca2460
Nc4452
ScOYE1
ScOYE2
ScOYE3
A36990



YSA QDGHM TD--YHIAHL 

YSA DDGHM TP - -WHMAHL 

YSCE S DPSSPHVGAL TN--YHLAHL 

YSA EDGHM TD--YHIAHL 

YSA DYNFEA TP- -YHLIHY 

YSS SPTDNQA TL--FHFVHY 

YSA -DDGHM TD--WHLVHL 

QMG -- FGNHL PN--PELAAV 

YSA DDGHL TD- -FHLVHL 

YSA DQEGHL TD- -FHLVHL 

YSA DNGHA TD--YHLVHL 

YSA KDGYA TD--WHLTHL 



GGIAQRGPGL 
GGIAQRGPGP 
G HLALKG AG L 
GGIAQRGPGL 
GSLVNRGPGI 
GSFAVRGPAL 
GSFALRGVPL 
YATWARG DWG 
GQFALHGTAL 
GAMGMRGPGL 
GQFALHGAAL 
GGIIQRGPGL 



MLIEATAVQP 
LMVBATAVEP 
VFIEATAVQP 
MMIEATSVSP 
TIVESTAVSP 
IILESIFVSE 
TIFEATGVLP 
LI LTGNVQVD 
TIVEATSVTP 
VMVEATAVSP 
SMVEATAVEA 
SMVEATAVQN 



E-GRITPQDV 
E-GRITPQDL 
N-GRISPNDS 
E-GRITPQDV 
E-GGLSPHDL 
N-SGLSIHDL 
N-GRITPECS 
HAHKGDAHDI 
N-GRISPEDS 
E-GRISPNDS 
R-GRISPEDV 
H-GRITPQDV 



-GLWK--DS 
-GLWK--DS 
-GLWQ--DG 
-GLWK- -DS 
-GIWK--DE 
-GLWN--DD 
-GLWQ- -DS 
-SPNH--PG 
-GLWQ- -DS 
-GLWFTMES 
-GLWQ- -DS 
-GLWE--DG 



QIAPMR 

QIEPLS 

TTSEQFLGLK 

OIAPMK 

QAEKLK 

QAHSLR 

QIAPLK 

TTPEQTVTAF 

QIAPLR 

QMKPLR 

QIAPLK 

QIEPLK 



RVI-DFVHSQ 
RVI-EFVHSQ 
RW-EFMHAQ 
RVI-DFVHSQ 
PIV-DYAHSQ 
KIV-DFIHDQ 
RIV-DYIHSQ 
KAWADAARLN 
RIV-DYVHSQ 
RIV-EFAHSQ 
RIV-DFIHSQ 
RIT-TFAHSQ 



GQ-KIGV- -Q 
NQ-LIGV--Q 
GA-KVGI--Q 
SQ-KIGV- -Q 
KQ-LIAI--Q 
DG-ICCI--Q 
GQ-KAGI--Q 
GQSKTPVWQ 
GQ-KIAI--Q 
NQ-KIGI--Q 
NQ-VAAI--Q 
SQ-KIGI--Q 



YSA --KDGVM TP- -WHKQHL GSFAARGPGL IVTBVNAVSP B-GRISPEDA 

YSA -KDGYA TD--WHLTHL GGIIQRGPGL SMVEATAVQN H-GRITPQDV 

YSA PDGHY TM--WHHTHM GGIIQRGPGL TCVEATAVTP Q-GRITPEDV 



-GIYD--DG 
-GLWE--DG 
-GIWQ--DS 



QLGPLR DIV-DFVHSQ GA-KIAI--Q 

QIEPLK RIT-TFAHSQ SQ-KIGI--Q 

QIEPLA KW-EFAHSQ NQ-KIMI--Q 



GLA TF DEADP-SKRG I PTEQLVQLY RRWGQGEWGQ IQTGNVMIDP EHLEAPGNMV 

YSA RDGFQ QP- -WHFAHY GGIAQRGPGL IMLEATAVQA R-GRITPEDS 

YSA NNGLP TP- - YHIAHL GSFALHGVGN VMVEASGVEP E-GRITPQDL 



YMA EDGLI ND--WHQVHY ASMARGGAGL LWEATAVAP E-GRITPGCA 

YSA APEGPSAGVP GD- -WHFAHY GARAVGGTGL IWEATGVSP E-GRISPQDL 

YSA TDGVA NE- -FHLVHL GQYALGGAGL I LAEATAVS P E-GRITPEDL 

YSA EDGAP TD--FHLVHF GSRALGGAGL LYTEMTCVSP D-ARITPGCA 



FRA DGQG VPLPFVQEYY GQRASVPGTL LITEATDITP K-AMGYKHVP 

NRG VP LN PT S TPEQPNRIWY PG-DLMVQYY RQRAT-PGGL IISEGVPPSL E-SNGMPGVP 

FRA AKNHT PS -DLQLEYY KTHSQYPGTL I ITEATFTSE Q-GGMDLHVP 

FRS- -DDE-HV PIVPLMTTYY SQRASVPGTL LVTEATFISP A - AGGYDNVP 

MRA LHPGNI PNRDWAVEYY TQRAQRPGTM IITEGAFISP Q-AGGYDNAP 

MRA QHPGNI PNRDWAVEYY AQRAQRPGTL IITEGTFPSP Q-SGGYDNAP 

MRA THPGNI PNKEWAAVYY GQRAQRPGTM IITEGTFISP Q-AGGYDNAP 

FRA SKD-HI PS-DLQLNYY NARSQYPGTL IITEATFASE R-GGIDLHVP 



- -VPRD- 


AEP 


•-GIWL- 


-DS 


-GIWS- 


-EQ 


-GIWS- 


-DA 


--GLWN- 


-DT 


■ -GLWD- 


-DR 


• -GMYK- 


-PE 


-GIWS- 


-EP 


• -GLWT- 


-PE 


• -GIYN- 


-DA 


--GIYN- 


-AA 


■-GVWS- 


-EE 


-GIWS- 


-EE 


-GIWS- 


-DE 


•-GIYN- 


-DA 



- SGERFDMFS KLAAAAKEHG SLIV-A Q 

HVEGLR KHV-EFAHAN NS-LIGI--Q 

HRDAHK ALV-SVLKSF TD-GLGVGLQ 



-HAQAFV PW-QAIKAA GS-VPGI--Q 
-QVEAFR RIT-GFLRSQ GT-VPAV--Q 
-QIVPLG HIT-DFVHQH GG-HIGV--Q 
-HVNAWK RIV-DFVHGN SDAKIGM--Q 



-QREAWR 
-QAAGWK 
-QTKAWK 
-QIAAWK 
-QMVEWT 
-QIKEWT 
-QVAEWK 
-QAKSWK 



EIV- 
RW- 
KIN- 
KIT- 
KIF- 
KIF- 
NIF- 
KIN • 



SRVHSK KC-FIFC- -Q 
DAVHEQ GG-YIYC- -Q 
DEI HAN GS-FSSM- -Q 
DAVHAK GS-FIFC--Q 
NAIHEK KS-FVWV--Q 
KAIHEN KS-FAWV--Q 
LAIHDC QS-FAWV--Q 
EAIHGN GS-FSSV--Q 
SEQ 3
SEQ 6
SEQ 8
SEQ 10
SEQ 12
SEQ 14
SEQ 16
SEQ 19
SEQ 22
SEQ 24
SEQ 27
SEQ 30
SEQ 33
SEQ 35
SEQ 38
SEQ 40
SEQ 42
SEQ 44
SEQ 83
SEQ 85
Bacteria
T44612
NP_625402
NP_295913
AF320254
OYE family
Af4875
Af4961
Ca2460
Nc4452
ScOYE1
ScOYE2
ScOYE3
A36990



LAHAGRKATT VAPW 

IAHAGRKAST VAPW- 

LAHAGRKASA VAPW 

IAHAGRKASN IAPW 

LGHGGRKASG QPLF 

LNHAGRKIVE GVPF 

IAHAGRKAST KAPW- 

INHPGRQSPM GAGT- 

LAHAGRKAST KAPWHDSPTP 

LAHAGRKAST TAPY 

LAHAGRKAST IAPW- 

LSHAGRKASC VSPW 

IGHAGRKAST WPW 

LSHAGRKASC VSPW - 

LAHAGRKAST VAPW 

VGHPGRQARG SVQ 

1GHAGRKASC VAPW 

LAHAGRKASD WSPF 

IAHAGRKASA NRPW 

LAHAGRKAST AQPW 

LAHAGRKAST YAPW 

LGHAGRKGAT KLAW 



ISFS A I ATEKVGG W 

LSAN DTASEKMGGW 

- LAAQAGKSS LKADESVGGW 

LMNKG IVATEKVGGW 

LHLE QVADKSVNGF 

QQIQHGW 

HYQRGKS ELAGPEQGGW 

RGLW 

SGEYKPREGL QWGPEYGGW 

RG-Y TVATEAQGGW 

ITEARGK ALAQESENGW 

LSVN AVAAEEVGGW 



PDRVKGPGDI P - FAEPFAKP 

PGRVKGPTNV P FTVKNPVP 

PADWGPSGG E EHIP SPEEDAYWVP 

PDRVIGPSTV P FHETFPTP 

ADKAVAPSAL A FRPNGNLP 

OEHCVGPSTE P-- - --FSDSHNTP 

PENVWAPSAI S -YNEETFPFP 

E-KAVAPSPV P LVLGEAFVP 

PDDVWAPSA1 P--- - --FSEDFPNP 

ENDVYGPFTK E DRWDEKHAQP 

PDDWAPSAI P YTKDWATP 

PDNIVAPSAI A -QENGVNPVP 



KA KTLDEIE QFKK-DWVAA 

KB -MTKQDIB DLKT-AWVAA 

RA --- LSTAEVR QWA - AFAKS 

KA - - -MTKDDIE QFKR-DWFDA 

VPNE LTKDEIK RWK-DFGAA 

RE LTVNEIN SIVE-DFANA 

KB MTVEQ1H ELVE-AWKAS 

RLLSKVLFGT PRELTVAEIK DIV-QKFAVT 

KB MTVEEIE GLVT-SFVDA 

HK LTEKQYD ELVD-KFWA 

RE LTTE.SR VWVK-KFAES 

KA FTKEDIE QLKS-DYVEA 



LWATGRAADP 
LWHAGRATIP 
LWYLGRVANP 
LWSLGRAANP 
LWVLGWAAFP 
LWVLGWAAFP 
LWSLGWASFP 
LWYLGRVANA 



DVLA- 
QMTG- 
KDLK- 
EVLA- 
DNLA- 
DTLA- 
DVLA- 
KDLK- 



LDRK NTAF7- 

LSIN AVAAKEVGGW PDNIVAPSAI A -QEAGVNPVP KA- - 

LSGG DVAGEDVNGW PQDVWAPSAI P WNEKHAVP KE-- 

LPS KRAGKEAGGW PEDWGPSGG EDFTWDERSS SDPSGGYYAP RE — 

-QHP1SASD VQLKQEM FGSKFGVP RP-- 

LDAG LAAEKAAGGW PDDWGPSNE P - - - FAPGYPTP RA-- 

--YRGEKKQ KFVTQEEGGW PDRWAPSA1 A --YAQGHVTP RA 

EGDD H I GADDARG W --ETIAPSAI A - - FGAHLPNV PRA- 

RGG APVGADAYGW --QPLAPSAL A - FTJERHPVP TE 

- RGK GAVPAELGGW - -QVIGPDEN S--- FHDLFPTP AM 

EG IDEPLEAGAW - - ELI S AS PL P YLPHSQVP RA- - 

DMK--D LISSS-AVPV EEKGP LP RA 

- -SPAVSAS ATVWDSPTEC YSHPP VGST EPVRYADHPP IE-- 

DAG LP L IGPSA- -VYW DEESE KLAKSVGNEL RE 

KEGGLK LKSSS-AVPM EEGAP - VP EE-- 

RDG-LR YDS AS DNVFM DAEQE AKAKKANNPQ HS 

RDG-LR YDS AS DNVYM NAEQE EKAKKANNPQ HS 

RDG-LR YDCAS DR VYM MATLQ EKAKDANNLB HS 

DSG-LP LIAPS-AVYW DENSE - KLAKEAGNEL RA 



-FTKEDIE BLKN-DFLAA 
-MSLDDIE AFKK-AFGEA 
-LSVREIK EMVQ-DWATA 
-ATKEDIK AVIE-GFAHT 
-ITLEEIE QLKE-DFVSG 
-LTTEDIN KLQD - KFVQS 

-MTLDDIA RVKQ-DFVDA 
-LTVPQIQ EAVG-RFADA 
-MGADELR GWD-AFSAA 
-MTRDDME RVRN-DFVRA 



-LTEDEIQ 
-LTIP-HL 
-LTEKEID 
-MTVAEIK 
-LTKDEIK 
-ITKDEIK 
-LTKDDIK 
-LTEEEID 



QCIA-DFAQA 
KQT I RDYCNA 
HIVEVEYPNA 
ERVA-EYAAA 
QYIK-EYVQA 
QYVK-EYVQA 
QYIK-DYIHA 
HIVEVEYPNA 



301 



311 



331 



341 



351 



361 



371 



391 



SEQ 3
SEQ 6
SEQ 8
SEQ 10
SEQ 12
SEQ 14
SEQ 16
SEQ 19
SEQ 22
SEQ 24
SEQ 27
SEQ 30
SEQ 33
SEQ 35
SEQ 38
SEQ 40
SEQ 42
SEQ 44
SEQ 83
SEQ 85
Bacteria
T44612
NP_625402
NP_295913
AF320254
OYE family
Af4875
Af4961
Ca2460
Nc4452
ScOYE1
ScOYE2
ScOYE3
A36990



TKRAIAA-GA 
VKRAVKA-GA 
ARLAVQA-GV 
CKRAIAA-GA 
ARRAVEISGF 
AWRAVEISKF 
AQRALKA-GF 
ARITAEA-GF 
AKRAIEA-GV 
AKRAVEI-GF 
AKRSNRA-GF 
AKRAIHA-GF 



DFVEIHNAHG 
DFIEIHNAHG 
DVIEI HGAHG 
DFIEIHNAHG 
DAVEI HGAHG 
DAI EI HCANG 
DLIBIHAAHG 
NGVEI HAAHG 
DI I EI HGAHG 
DVIEI HGAHG 
DVIEIHAA- - 
DVIEI HAAHG 



YLLSSFLSP- 
YLLMSFLSP- 
YLINBFLSP- 
YLLSSFLSP- 
YLINBFYSP- 
CLIHQFLSK- 
YLISEFLSP- 
YLLAQFLSK- 
YLITEFLSP- 
YLISSTVSPA 



-AANNRTDQY 
-AVNTRTDEY 
-VTNKRTDAY 
-SSNTRTDEY 
- 1 SNKRTDEY 
-LTNKRADQY 
- 1 SNQRTDQY 
-KTNRRGDEY 
-LSNKRTDKY 
FTTNDRNDKY 



G-GSFENRIR 
G-GSFENRIR 
G-GSFENRTR 
G-GSFENRIR 
G-GSFENRTR 
G-GSFENRVR 
G-GSFENRTR 
G-GSAENRAR 
G-GSFENRTR 
G-GTFEKRIL 



LSLEIAQLTR 
LSLEIAKLTR 
I VREVAAAIR 
LSLEIAQVTR 
FLKEVIDSVK 
FLLQIIENIK 
VLREIISAVR 
IVGEIIKECR 
VLIDIIKAVR 
FPMEWHSVR 



DAVGPHVP VFLR 

ENVPKDMP VFLR 

AVIPEGMP-- LFLR 

DAVGPNVP VFLR 

SSIPNDVP- VFLR 

RKIET--P IFLK 

SVIPEDMP LFVR 

RQVTEAVGEE EAKKFWGIK 

AVIPEEM PLFVR 

KAIPDSMP- - LFYR 



ISAS 
VSAT- 
ISAT- 
VSAT- 
ISAA- 
FPMS- 
VSAT- 
LNSA- 
ISAT- 
VTAT- 



DWCE- 
DWLE- 
EWLE- 
DWIE- 
ENSP- 
DNCS- 
EWME- 
DWQA- 



ETLPEQ 

EVQPNKP 

-GQPVAAESG 

ETLPBB 

DPB 

DPE 

YTGQP- 

-GRDGKEEBE 

YAGEP 

KG0-- 



YLIJ1QFLSP- -VSNQRTDEY - 

TDEY G-GSFENRIR WLEILDLIR AAIPETTP- 



-VLVR VSAT-DWFEP DSQFKDEFPE 



. KRA . RA-GF DVIEI HAAHG Y.LHQFLSP- 

VKRALKA-GP DVIEIHNAHG YLLHEFICL- 

AKRAVKA-GV DVIEI HGAHG YLIHEFLSP- 

AEYLEKA-GF DGI ELHAAHG YLLAQFLSE- 

VRRAVEA-GF DTIDFHFAHG YLVSSFLSP- 

ARWAFEA-GY DYVELHSAHG YLMHSFLSP- 

ARRARDA-GF EWIELHFAHG YLGQSFFSE- 
ARRALAA-GF EI AE I HGAHG YLI HEFLSP- 
ARRAOVA-GF DAVEVHAAHG YLLHQFLSP- 
TRMAAEA-GF DILELHCAHG YLLSSFLSP- 



- VSNQRTDEY G-GSFENRIR WLEII 

-RATPGPTST G-GSWENRTR LTMESRRPCP QH? 

-ITNRRTDSY G-GSFENRTR LLIEIVTAVR AAMPSSMP- 

-TTNQRTDEY G-GSLENRMR LILEVTAEVR RRTSKNF 

-ATNKRTDKY G-GSFENRVR LALEIVEAAR AVMPEDMP- 
-LTNQRTDEY G-GSLENRAR FLLNVARRIR QEFPNKG 

- HSNKRTDAY G-GSFDNRSR FLLETLAAVR EVWPENLP- 
-HSNQRTDAY G-GSYANRTR FALEWDAVR EVWPDDKP- 
-LANTRTDDY G-GSFENRTR LLLEWRAVR HVWPAHLP- 
-LTNRRTDEF G-GDLENRAR FPLEVFKAMR AMWPTNRP- • 



- -LFLR LSST-EWME- 
-ILGIK INSV-EFOE- 
--LFTR ISGT-DWLE- 
--LWVR VSST-DWAD- 

--LTAR FGVL-EYDG- 
--LFFR VSAT -DWLE - 
- -LFVR LSAT-DWAE- 
--MSVR LSCH-DWFP- 



-DTDIGKKFG 



--NNPBYEGE 
QAHQAD 



ARNAINA- 
AKTAMEI- 
AKRAIEA 
AKNAVEA- 
AKNSIAA- 
AKNSIAA- 
AKNSIAA- 
AKHALEA 



GF DGVEIHGANG 
GF DGVELHAGNG 
GF DYIEVHSAPG 
GF DGVEIHGANG 
GA DGVEIHSANG 
GA DGVEIHSANG 
GA DGVEIHSANG 
GF DYVEI HGAHG 



YLIDQFTQK- 
YLPEQFLSS - 
YFLDQFLNP- 
YLIDOFLQD- 
YLLNOFLDP- 
YLLNOFLDP- 
YLLNQFLDP- 
YLLDOFLNL- 



-SCNHRQDRW 
-NVNKRTDEY 
-ASNKRTDKY 
-TCNQRTDEY 
- HSNTRTDEY 
- HSNNRTDEY 
-HSNKRTDEY 
-ASNKRTDKY 



G-GSIENRAR 
G-GSPEKRCR 
G-GSIENRAR 
G-GSIENRSR 
G-GSIENRAR 
G-GSIENRAR 
G-GTIENRAR 
GCGSIENRAR 



FAVEVTRAVI 
FVLELMDELA 
LLLRIIDKLI 
FAHEWKAW 
FTLEWDALV 
FTLEWDAW 
FT LEWD AL I 
LLLRWDKLI 



EAVGADR- 
ATVGEDN- 
GIVGAEK- 
EAVGAEK- 
EAIGHEK- 
DAIGPEK- 
ETIGPER- 
EWGANR- 



VGVK 
-LAIR 
-LAVR 
-TGIR 
-VGLR 
-VGLR 
-VGLR 



LSPY-SQYL- 
LSPF-GLFN- 
LAPW-SSFL- 
LSPY-STFQ- 
LSPY-GVFN- 
LSPY-GVFN- 
LSPY-GTFN- 
LSPW-ASFQ- 



GMGTMD 

0ARG 

- - -GMEIEG 
GMKMKK 



-SMSGGA 
-SMSGGA 
431



10- 



SEQ 3
SEQ 6
SEQ 8
SEQ 10
SEQ 12
SEQ 14
SEQ 16
SEQ 19
SEQ 22
SEQ 24
SEQ 27
SEQ 30
SEQ 33
SEQ 35
SEQ 38
SEQ 40
SEQ 42
SEQ 44
SEQ 83
SEQ 85
Bacteria
T44612
NP_625402
NP_295913
AF320254
OYE family
Af4875
Af4961
Ca2460
Nc4452
ScOYE1
ScOYE2
ScOYE3
A36990



SWKSEDTVR- 
SWRGVDTVR- 
SWDM-QSSL- 
SWKLSDSVR- 
AWTIEDSKK- 
AWSTEDALK- 
SWDLQQTI - - 
TDTAEBVLK- 
SWDLEQSTQ- 
GWEIEDTVAP 



FAQELVK 
-FAKILA 
ELVKKLP 
FAEALAA- 
-LADILV- 
-LADLVI- 
ELAKILP- 
-QIELFE- 



TLAARLR 



-0 GAVDLIDISS 
BT GYVDVLDVSS 
-B WGIDLVDVSS 
-0 GAIDLIDVSS 
-E KGIALVDVSS 
- D LGVKVIDVTS 
-D LGVDLLDVSS 
-Q WGIDFVEVSG 
-D LGVDLLDVSS 
-D GGVDLIDVSS 



########## ##########

GGVLAOO KI 

GGTHSEQ HI 

AANHKDQ KI 

GGVHAAQ --KI 

GGNDYRQPP RSG I SK 

GGNVAHCKS RY LLND 

- KI 



GSYEDPQMAN GPKPEKSERT 

GGNSVAO KI 

GGNHKDQ RI 



451 



KSGPAFOVPF 
HAKPGFOAPF 
NLHTAYQTDL 
KSGPAFQAPF 
BLREPIHVPL 
DKQLPSQVPL 
NVHTYYQIDM 
MAREAFFLEF 
ELTPYYOIDL 
EVKDCYQVPF 



AVAVKKAVGD KLLVAAV GAIT 

AIAVKNAVGD KLAVASV GMIA 

AGOIROAI RAAGAST LVGAVGLITD SEQARGLVQG 

AVAIKKAVGD KLLV ATV GTIT 

SRAIKOHVGD KLLVSCV GGLB 

ARKLKSHIRN ---RCLIACS GGLD 

AEQIRAAVHE AGKQLLVGAV GLVT- 

AKIIRTK -FPKLPLMVT GGFR- 

AAKIREAVGD RLLIGAV GNIN- 

AEKIKDQVNG ILLGAV GMIR- 



-SA BIAKETVOEK 



SWTVEQTC- - QLARILP- -K HGVDLVDVSS GGIHPKS - 



-AIAI KSGPAYQVDL AKQVKKAVGD ---SVLVSAV GGIK- 



SWDVESTIK- -ISKILA--D LGVDLLDVSS GGNHPQQ- - 

FKP-EEAVQ- LCEALEAAGM DFVBTSG GTYESFG-- 

TWTLEQSIK- -LAHQLA--D RGVDVLDVSS GGIHKMQ-- 

SWTVDOTVE- LAKMLQB ARVDLLDVSS GGLVPFQ- - 



KI NMFNT - - 

-FAHRKESS RKRENYFIEF AEVIRKAVKH MWYTTG GFKT- - 

KV AAGPGYOAPL AKAIKKSVGD KMLISTV GSIK-- 

KI TVGAGYOLFG AKAVRDALAK - - IBPDASKR ML VGA - 



EQTLEES I - - ELARRFK- -A GGLDLLSVSV GFTIPET- 

GWTPDDTVR- -FARDLB--A HGIDLLDVST GGNVPRV- 

GWDLEQTVO- -LSKLLK--Y EGVDVLDISS GCLTAAQ- 

GNTADDAVA- -IARLFK--E AGADIIDCSS GQVWKGD- 



-NI PWGPAFMGPI AERVRREAKL -• 

-RI PTGPGYOVPF AARVKAGST- -• 
-01 EVGPGYQVPF AAAVSRAETE 

-OP VYGRMYQTPF ADRIRNEVGI -- 



--PVTSAW GFGT- • 
--LPVAAV GLIT- 
--ISVMAV GLIE- 
--PTLAVG AISE- 



EL--VPQFEY 
EQR - VETWTF 

EE IHSY 

DLIP--QFED 
ETGIVAQYAY 
ETGIVAQYAY 
EPGI IAQYSY 
EE IHSY 

501 



LI A QM 

LCESLKKAHP 
ILOOLOORAD 

VIRKIN 

VAGELEKRAK 
VLGELERRAK 
VLGELEKRAK 
ILQQLOQRAD 

511 



RRLDVAYLHL 

NLSYVSF 

NGOQLAYVSL 
-GFGLAYLHL 
AGKRLAFVHL 
AG KR LAFVHL 
AGKRLAFVHL 
NGOOLAYISL 

521 



ANSRWL 

IEPRYE 

IEPRVIG - - 
TQSRVAGN- 
VEPRVTNP- 
VEPRVTNP- 
VEPRVTDP- 
VEPRVTG 

53 1 



DE EKPHPDPNHE VFVRVWG-Q- --SS-PILLA GGYD 

-QIFSYEBKD NFLRSWG LSDVDLSSFR KIFGTTPFFS 

- 1 FDASL EDQKGRSNEF AYKYWKG NFVRA GNYT - 

"MDVOP EEDEE-NLAF AAKLWDG PLLIA GGLT 

-FLTEGE GEYEGGSNDF VYSIWKG --PVIRA GNFA 

-FLTEGE GEYNGGSNKP AYSIWKG PI IRA GNFA 

-SLVEGE GEYSEGTNDP AYSIWKG--- PI IRA GNYA - 

-IYDVSL KDQOGRSNEF AYKIWKG- - NFIRA GNYT 



551 



561 



571 



581 



SEQ 3
SEQ 6
SEQ 8
SEQ 10
SEQ 12
SEQ 14
SEQ 16
SEQ 19
SEQ 22
SEQ 24
SEQ 27
SEQ 30
SEQ 33
SEQ 35
SEQ 38
SEQ 40
SEQ 42
SEQ 44
SEQ 83
SEQ 85
Bacteria
T44612
NP_625402
NP_295913
AF320254
OYE family
Af4875
Af4961
Ca2460
Nc4452
ScOYE1
ScOYE2
ScOYE3
A36990



--NGKQ--AN QILEEQD- - --- IDVALVG RGFOKDPGLA WTFAQHLGV- 

S AH LANS LLEKDG LDLVLVG RGFQKNPGLV WAWADELNV- 

ABEATAAEAM LSGPEPK- - - - ADA I LI A RQFLREPEWV FSTARKLGV- 

--NGKQ--AN KLLEEEG LDVALVG RGFQKDPGLA WTFAQHLDV- 

KDPELLN KYLEEGT - FDLALIG RGFLRNPGLV WEFAD KLG V - 

RDIFKLD EFIANGD FDIALIG KGFLKNTGLI SRIADQLQA- 

E-DGRVTIQR ENGAKTR - -ADMVLVA RQFLKEPEFV LTVADELGV- 

TRQGME AALESDD CDMIGIG RPAIINPSLP ANLILNPEV- 

- -TAD I - -AR DWDEQGAEK VAEAKQTHDT I EWSESHGG KTKADLVL I A RQFLREPEFV LRTAHNLGV- 
--DGLFTTAN EILESGK ADVTFVA REFLRNPSLV LDSANQLGE- 



-BISMAN 
-EISMAN 
-PVTVPV 
-EIAMAS 
-RLHQAL 
-QFRTAP 
-DVKAPV 
-PDADAR 
-NVQWPH 
-NVAWPV 



QIRWGFTRRG 
QIRWGFSRRG 

QFGRAI 

QIRWGFTRRG 
QLGWGFWPNK 

QYKLALS 

QYLRGPLSSR 
LFDKKRAEPH 
QYHRAVWRKG 
QYDYAVKGHR 



-TGHL--AE EVLQSG --- IDIVRAG RWFQQNPGLV RAFANELGV- 



-EVKMAN QIDWSFKGRG 



--VGAM-VDA LQGVDG 

- - IGTL- -AE EI IAGG 

- - VGMM- - EG S YDS PNG - 

-PQLAE AALOANO- 

- EPG QAE KILANGE- 

--TGA--QAE AILQAGD- 
- - -AD- -HAN SIIAAGR- 



- IGIG RAAGSEPDLA KDIIAGKVSS I I KYAMGEDE FVLQLTACSA OIRLMAKGEE 

DTPLDLVASG RLFQKNTGLV WSWADDLNT SIQIAH QIAWGFGGRA 

---ODRSQIG KLAEQSIQSG ECDAVLLAR- - - -GLMSYPS WTEDASVALM GTRAAGNPQY 



-LDLVSVG RAHLADPHWA Y FAAKELG V - 
-ADAVLLG RELLRNPSWA QHAARELGV- 
-ADLIALG RPFLRDPHWA QRAARELGL- 
-ADLCAIA RPHLADPAWT LHEAAKIGF- 



-EKASWT LPAPYAHWLE 

-DARMPD QYGWGM 

-RPVSID QYARAGW 

-GEVAWP KQYRSARGQY 



AASAEKVTEQ 
AGGWDQSNSW 
- YDAPEFKTL 
-PETAK-HLV 
-LHP---EW 
-LHP---EW 
-LHP- --EW 
- YDAPEFKTL 



MAAATYT- 
GVLEEGR- 
LHDLDND- 
DREFPEK- 
REEVKDK- 
REEVKDP- 
REQVKDP - 
INDLKND- 



-NVAIAFG 
-YDALLYG 
-RTIVGFA 
-DWATPG 
-RTLIGYG 
-RTLIGYG 
-RTLIGYG 
-RSIIGFS 



RYFI STPDLP 
RYFTSNPDLV 
RFFTSNPDLV 
RHFI STPDLP 
RFFISNPDLV 
RFFISNPDLV 
RFFISNPDLV 
RFFTSNPDLV 



FRVMAGIQL- 
ERLRKGIPF- 
EKLKLGKPL- 
FRIKEGIEL- 
DRLEKGLPL- 
DRLEKGLPL- 
YRLEEGLPL- 
EKLKLGKPL- 



-QKYDRA 
-TPYDRS 
-NHYDRE 
-NPYDRD 
-NKYDRD 
-NKYDRD 
-NKYDRS 
-NYYNRE 



S FYSTLS REG 
RFYGPFEDNA 
EFYKYYNY-G 
TFYKAKSPDG 
TFYOMSAH-G 
TFYKMSAE-G 
TFYTMSAE G 
EFYKYYNY-G 
SEQ 3
SEQ 6
SEQ 8
SEQ 10
SEQ 12
SEQ 14
SEQ 16
SEQ 19
SEQ 22
SEQ 24
SEQ 27
SEQ 30
SEQ 33
SEQ 35
SEQ 38
SEQ 40
SEQ 42
SEQ 44
SEQ 83
SEQ 85
Bacteria
T44612
NP_625402
NP_295913
AF320254
OYE family
Af4875
Af4961
Ca2460
Nc4452
ScOYE1
ScOYE2
ScOYE3
A36990



GTPYIDPSVY KQSIFDV 

AGPYLRKKLE KI - 

GTPYIDPKAY KESIFE -- 

QQIVDLIERT SKLEVN 

PKKLTTVP- - 

HI VEKIjGMKS IVGAGVEVTW yvselkklak f- 

Afil 

- -KLR - 



PFDISNADEV ARVTQLMAEG KV- 

KKNAPKLVL 

KRVHVAKK- - 



ETNUJRAAAA VAGK - 



YLDYPPSAEY 
KCYVDY P PAT 
YNSYDESEKQ 
YIDQPFSKEF 
YIDYPTYEEA 
YIDYPTYEEA 
YTDYPTYEEA 
YNSYDESEKQ 



MAUOJPPV- 

ASS- 

VIGKPLV- - 
EKVYGAQA- 
LK1GWDKK- 
LKLGWDKN - 
VDLGWNKN- 
VIGKPLA-- 



Figure 1. A multiple alignment of the 2031 OR amino acid sequence
from A. fumigatus (SEQ ID No. 3) along with related 2031 ORs from
other fungi and bacteria (see Example 4) and OYEs. Regions 1-11,
marked with * or #, refer to amino acids conserved between ORs
but not OYEs.
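
The idea of a region "conserved between ORs but not OYEs" can be illustrated with a minimal sketch. It assumes one possible reading of the legend: a column qualifies when every OR row carries the same residue and at least one OYE row differs. The function name, the toy rows and this criterion are hypothetical and are not taken from Example 4.

```python
from typing import Dict, List

GAP = "-"


def group_specific_columns(ors: Dict[str, str], oyes: Dict[str, str]) -> List[int]:
    """Return 0-based alignment columns where every OR row carries the same
    (non-gap) residue and at least one OYE row carries something else."""
    rows = list(ors.values()) + list(oyes.values())
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all aligned rows must have the same length")

    hits = []
    for col in range(length):
        or_residues = {seq[col] for seq in ors.values()}
        # Skip columns that vary among the ORs or that are gaps in the ORs.
        if len(or_residues) != 1 or GAP in or_residues:
            continue
        residue = next(iter(or_residues))
        # Keep the column only if some OYE row does not share that residue.
        if any(seq[col] != residue for seq in oyes.values()):
            hits.append(col)
    return hits


# Toy aligned rows invented for illustration; they are not rows of Figure 1.
toy_ors = {"or_a": "GHAGRK", "or_b": "GHAGRQ", "or_c": "GHAGRK"}
toy_oyes = {"oye_a": "GLGWAK", "oye_b": "GLGRVK"}
specific = group_specific_columns(toy_ors, toy_oyes)
```

Runs of adjacent qualifying columns would correspond to marked regions; a stricter criterion (for example, requiring every OYE row to differ) would be an equally defensible choice.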



Fungal 2031 ORs are given by the following SEQ ID Nos.: A.
fumigatus, SEQ ID Nos. 3, 6 and 8; A. nidulans, SEQ ID No. 10; C.
albicans, SEQ ID Nos. 12 and 14; N. crassa, SEQ ID Nos. 16 and 19;
M. grisea, SEQ ID Nos. 22 and 44; S. pombe, SEQ ID No. 24
(NP_595868); C. trifolii, SEQ ID No. 27; F. sporotrichioides, SEQ
ID Nos. 30, 33 and 35; F. graminearum, SEQ ID Nos. 38 and 83; M.
graminicola, SEQ ID Nos. 40 and 42; U. maydis, SEQ ID No. 85.

Bacterial ORs resembling 2031 are: 

T44612 (Pseudomonas putida), SEQ ID No. 86; NP_625402
(Streptomyces coelicolor), SEQ ID No. 87; NP_295913 (Deinococcus
radiodurans), SEQ ID No. 88; AF320254 (Azoarcus evansii), SEQ ID
No. 89.

Fungal ORs similar to the Old Yellow Enzyme family (originally
identified in S. cerevisiae):

A. fumigatus, Af4875 and Af4961, SEQ ID Nos. 90 and 91
respectively; C. albicans, Ca2460 and A36990, SEQ ID Nos. 92 and
93 respectively; N. crassa, Nc4452, SEQ ID No. 94; S. cerevisiae,
OYE1, OYE2 and OYE3, SEQ ID Nos. 95-97 respectively.

Details of the sequence searches that identified the ORs other 
than SEQ ID No. 3, and methods for the construction of multiple 
alignments are given in Example 4 hereinafter. 
1 11 21 31 41 51 61 71 81 91

SEQ 1 GTTCGACGTC ATTGCCACGT TTCGACCCAA GGGCAGACGC CATGTCGCCG AGCGATCGCC GCGATATGCC TCGAATTTGC GCCATTCGGC ATCCAGTTTC 

SEQ 2 

SEQ 4 - - 

SEQ 5 --- --- 

SEQ 7 

SEQ 9 

SEQ 11 

SEQ 13 

SEQ 15 

SEQ 17 

SEQ 18 

SEQ 20 -- - - - -- - 

SEQ 21 

SEQ 23 

SEQ 25 

SEQ 26 -- 

SEQ 28 - 

SEQ 29 

SEQ 32 

SEQ 34 - 

SEQ 36 

SEQ 37 - 

SEQ 39 -- 

SEQ 41 

SEQ 43 - " 

SEQ 82 --- - - 

SEQ 84 

101 111 121 131 141 151 161 171 181 191 

SEQ 1 CAGTGCCCTT CCCCGAATGA CTGTCTCCAC TATTCGGCAA GATTGTAAAT CAAGCCTGAA GAAGCGGAGC AATTCTTGGA AGTCGTATGT TCTACTGATT 

SEQ 2 GTATGT TCTACTGATT 

SEQ 4 - 

SEQ 5 - - - 

SEQ 7 

SEQ 9 

SEQ 11 - 

SEQ 13 -- - " 

SEQ 15 

SEQ 17 - 

SEQ 18 " - 

SEQ 20 - 

SEQ 21 - 

SEQ 23 - 

SEQ 25 - CGAAA CCTCGACCCA AACAAACAGC 

SEQ 26 

SEQ 28 - GAAC 

SEQ 29 

SEQ 32 - 

SEQ 34 AGGAAG TTGCATGTCA CTTGTAGTGA CAGGGCGTCG TGTAAATTTT ATAAATACCT ATACTTGTTT GTTCACTTCT ATGCTACTCA TATCAATCCG 

SEQ 36 - -- " 

SEQ 37 

SEQ 39 - - 

SEQ 41 - 

SEQ 43 

SEQ 82 --- - - 

SEQ 84 

201 211 221 231 241 251 261 271 281 291 

SEQ 1 TCTGTGCCTG GCGCAGACGG GTATATAAAT AAAGATCACC GCACCGAGGA GTTTCTTACC AACCCATCAA TAACCATCCA CAATCTCCTA CAACAAAAAT 

SEQ 2 TCTGTGCCTG GCGCAGACGG GTATATAAAT AAAGATCACC GCACCGAGGA GTTTCTTACC AACCCATCAA TAACCATCCA CAATCTCCTA CAACAAAAAT 

SEQ 4 - A TGTCGCAACC 

SEQ 5 A TGTCGCAACC 

SEQ 7 -- -- --A TGGGTTCCAA 

SEQ 9 - - - -- AT 

SEQ 11 --- - ATGACAG TTCCATACCA 

SEQ 13 -- - 

SEQ 15 - --A TGGCCGACTT 

SEQ 17 - 

SEQ 18 

SEQ 20 ATGTC 

SEQ 21 - - ATGTC 

SEQ 23 - - 

SEQ 25 TGACCCTCTC CTTGACAACA AAGCCGGCCA TCCTCGCCGA CGATTGCCTC TACCCCCGCA TAGTCACACT CGCACGTCCG TTCTCCCACC GTCAAACAGA 

SEQ 26 -- 

SEQ 28 TGCTGTAGAT GTGGTTGAAT TGGTATATTA GACCGGAGTA CTCTATATGC GAGAGACTAT ACATTGAAGT TGCCAACGTT CTTCCAGATT GATTAATCAT 

SEQ 29 AT 

SEQ 32 - - - 

SEQ 34 AGAAGATCAA ACAGTCCCCT ATACACACTT GTCAAGACCT ATCTATTATT TCAAAAATCA GCAATATGGC TGAGACAATG CCTAAGTGTG AGGCAAATGG 

SEQ 36 - - - 

SEQ 37 

SEQ 39 

SEQ 41 - 

SEQ 43 --- - - 

SEQ 82 --- ---ATGACAG TTCAATCACA GCAACAATCC CAGGCTATTC CCGTCCTTTC TTCCCAGAAT GGCACTGAAC CCCAAGACGC 

SEQ 84 - AT GGACACGTCT CGATTCGTGT CTGGTCTCAC 
CCATCACAAA
-GCACGAGGG 
-GCACGAGGG 


ATCATCATCA 
ATTATTGACA 
ATTATTGACA 


ATAAGGAAGC 
ACATCGCGGC 
ACATCGCGGC 


TCCGAATGTT 
TGAAGGGGCT 
TGAAGGGGCT 


C CTTTCT ATA 
CCCTACTACA 
CCCTACTACA 


CTCCAGTGCA 
CGCCTGCTCA 
CGCCTGCTCA 


AGATCCACCA GCAGGAACGT CTTACGATGT 
AGACYCTCCA GCAGGCACAC AGACCAGCGG 
AGACYCTCCA GCAGGCACAC AGACCAGCGG 


TCAGCCTGAA 
CTCAACCA- - 
CTCAACCA- - 





SEQ 1 GACTGTCGCC GATATCGACG TTCCTCCTGC CGAGGGCATC CCCTACTTCA CTCCGGCCCA GAACCCTCCT GCCGGTACGG CAGCTAACCC CCAGACCAAT

SEQ 2 GACTGTCGCC GATATCGACG TTCCTCCTGC CGAGGGCATC CCCTACTTCA CTCCGGCCCA GAACCCTCCT GCCGGTACGG CAGCTAACCC CCAGACCAAT 

SEQ 4 TGTTGTGCCT GACATCGAGA ACAAACCCGC GCCGGGTATC TCGTACTTTA CTCCGGCGCA AGAGCCGCCT GCTGGCACCG CTGCTAATCC TCAGTCTGAT 

SEQ 5 TGTTGTGCCT GACATCGAGA ACAAACCCGC GCCGGGTATC TCGTACTTTA CTCCGGCGCA AGAGCCGCCT GCTGGCACCG CTGCTAATCC TCAGTCTGAT 

SEQ 7 CGCCTTCCGG TCCCCCGCCG TCACCAAGTC CTCCTCCACC CCCTACTACA CTCCCGCCAA CAATGGAGGC GCCGCCCTGC ACCCCGACGA CCCCAC 

SEQ 9 GGCTCTCCCT GACGTCGAAA ACACCCCCGC CGCCGGCATC CCCTACTTTA CACCAGCACA GAACCCTCCT GCTGGAACAG CTGCCAACCC GCAAACCAGC 

SEQ 11 AGTAAAACCA TCAGATGAAA TCAAAGGTGC TCCTGAGGTT TCCTATTACA CTCCAGAACA GCCTGTTCCG GCTGGTACTT TTTATCCCCA ATCGTC A 

SEQ 13 - - - ATGGAAA ACAACAATAC TATACCG 

SEQ 15 CACCCAGAAG AAGACCTCCT CCCCCGCGGC CCCGGGTGTT CCCTTCTACA CCCCGGCCCA GGTCCCCGCC GCCGGCACTC CCCTCCCCTC CACCCCC

SEQ 17 ATGGCTACTT CCACTACCTC CGACCTC 

SEQ 18 -- - - - ATGGCTACTT CCACTACCTC CGACCTC 

SEQ 20 GGCAGAAAAG AAGACTTTGA GCAAACCGGC CGCCGGGGTG CCTTACTACA CCCCAGCCCA GGAGCCGCCG GCAGGGACCC CTTTGCAGCA GCAGGACG- -

SEQ 21 GGCAGAAAAG AAGACTTTGA GCAAACCGGC CGCCGGGGTG CCTTACTACA CCCCAGCCCA GGAGCCGCCG GCAGGGACCC CTTTGCAGCA GCAGGACG- - 

SEQ 23 ATGAC TATTGTTAAT GAAGGAGCCG AAAATGTTGG TTATTTTACA CCTGCGCAAA AAATACCAGC TGGAGCGGCG ATAGGTGTAC CGCAAA 

SEQ 25 CAGCATGACG GGCACCGCGA ACAAGGCCGC CCCCGGTGTG CCGTTTTACA CCCCGGCCCA GGAGCCTCCC GCGGGAACGC CAGTCGACGC CAGCACGG - - 

SEQ 26 ATGACG GGCACCGCGA ACAAGGCCGC CCCCGGTGTG CCGTTTTACA CCCCGGCCCA GGAGCCTCCC GCGGGAACGC CAGTCGACGC CAGCACGG- -

SEQ 28 GGCTTACGAG ATAATCGACA ACGTTGCGGC TGAAGGGGTT CCATATTACA CACCGGCTCA AGACCCGCCA GCTGGTACGC AGACAAGCGG CTCAACG

SEQ 29 GGCTTACGAG ATAATCGACA ACGTTGCGGC TGAAGGGGTT CCATATTACA CACCGGCTCA AGACCCGCCA GCTGGTACGC AGACAAGCGG CTCAACG

SEQ 32 
SEQ 34 
SEQ 36 
SEQ 37 
SEQ 39 
SEQ 41 

SEQ 43 ATGT CCCCACCACG CTTCGAAGCG GCCCCTGCCG ACCCCTCACC GCTCGGC 

SEQ 82 AAACAAGGAG GTTGTTCAGA ATGTCGCTGC CAAAGGAGTG CAATACTTCA ACCCTGAGCA ACTTCCTGCA CCAGGTCTCG GTATAAACGG TCCCAAT 

SEQ 84 ACCGCCTCTC GTCGACTCGA TCGATGCACT CAAGATCAGC AACTTTGTCC CCACTCGAAG TGGCCACCCT CCTCCTGGCT CGGTCCCGGA ATCCATCCTG 



SEQ 1 GG- CC AGAAGATCCC CAAGCTCTTC ACGCCCTTGA CCATCCGTGG CGTCACC -TTCCAGAAC CGCCTTGGTG

SEQ 2 GG CC AGAAGATCCC CAAGCTCTTC ACGCCCTTGA CCATCCGTGG CGTCACC -TTCCAGAAC CGCCTTGGT- 

SEQ 4 GG AT CGGCACCTCC CAAGCTCTTC CGGCCGCTTT CGGTGCGGGG TCTGACC -TTTCACAAT CGCATTGGCG

SEQ 5 GG AT CGGCACCTCC CAAGCTCTTC CGGCCGCTTT CGGTGCGGGG TCTGACC - -TTTCACAAT CGCATTGGC- 

SEQ 7 GACCCC TACGCTCTTC CGGCCCTTAC AAATCCGCAA TGTGACG -CTCAAGAAC CGCATCATG -

SEQ 9 GG CA ATGCCGTCCC CAAGCTGTAC ACACCTCTGA CGGTGCGTGG GGTGACC -TTCCACAAC AGACTTGGC- 

SEQ 11 GA- TG AAGTTGCTCC CAAAATTTTT CAACCTTTAA AGATTGGTAA GCTTGCT- -TTGCCAAAC AGAATTGGG- 

SEQ 13 -GCATTATTT CAAC C CAT AA AGATCAGTGA CTCGATC AC ATTACCTAAT AGAATTGGT- 

SEQ 15 G GCGATGTCCC TACTCTCTTC ACCCCTCTCA AGATCCGTGG TGTTGAG CTCCAGAAC CGCTTCGCC- 

SEQ 17 AAACTCTCC CAACCCCTCA CCCTCCCCAA TGGCCTT- AC CCTCCCCAAC CGCCTCGTC- 

SEQ 18 AAACTCTCC CAACCCCTCA CCCTCCCCAA TGGCCTT AC CCTCCCCAAC CGCCTCGTC- 

SEQ 20 CCATCCC AACGCTGTTC AAGCCTCTGA AGATCCGTGG CGTCGAG - -CTCTCCAAC CGCTTTGGC- 

SEQ 21 CCATCCC AACGCTGTTC AAGCCTCTGA AGATCCGTGG CGTCGAG -CTCTCCAAC CGCTTTGGC- 

SEQ 23 - - C AAAATTATTT ACTCCTCTTA AAATTAGAGG AGTGGAG -TTCCATAAC AGAATGTTT- 

SEQ 25 CTCC GACGCTCTTC AAGCCCCTCC GCATCCGCGA CCTCACC - ATCAACAAC CGCATCTGG- 

SEQ 26 CTCC GACGCTCTTC AAGCCCCTCC GCATCCGCGA CCTCACC -ATCAACAAC CGCATCTGG-

SEQ 28 -AAGCTATTC ACACCCATCA CCATCCGCGG CGTCACA- -TTCCCAAAC CGCCTCTTC- 

SEQ 29 -AAGCTATTC ACACCCATCA CCATCCGCGG CGTCACA -TTCCCAAAC CGCCTCTTC- 

SEQ 32 - -- - - - 

SEQ 34 GG AAGCCTATTC TCTCTTATTA AAATAAGAAA CCTGACT -CTTCAAAAC CGGATTTTT- 

SEQ 36 - AGGTTTTC ACACBCATCA CCATCCGAGG CGTCACA -TTCCCAAAC CGTCTCTTT- 

SEQ 37 --- - --AGGTTTTC ACACBCATCA CCATCCGAGG CGTCACA TTCCCAAAC CGTCTCTTT - 

SEQ 39 CCTCA AGATCCGAGG TCTTACC -CTCCAGAAC CGTATTATG - 

SEQ 41 

SEQ 43 ACGC CGCTCAAATA CCCCGTCTCG GGGCGGTCG- -GCGCCCAAC CGGTTCCTC - 

SEQ 82 A ATACTCTACC AAAGGTCTTT ACACCCATCA AGATTCGCGG CATGACC - - - - -- ATGCCCAAC CGTATCTGG - 

SEQ 84 CCAGAGGGTG TCAAAAAACC GGCTTTGTTC CAAACGTTGA CATTGCCCTT TGCTGCACCG GAACAGGCGG GTAAGATGAC CTTCAAGAAC CGCATCATT- 



SEQ 1 TAAGTCCGTT TGCCCTTGCT CATATCGACG AAAGCTAATC CCCCGTCAG- CTCGC GCCCCTCTGC 

SEQ 2 -- --- CTCGC GCCCCTCTGC 

SEQ 4 TGAGTGCAGT CCAGGCAATT ATGCTATCCA TCCTATGCGA GCCCTTGCAT TGGAACAGCC GCTTACAGGG AATGATAATG AGTAGCTATC GCCACTCTGC 

SEQ 5 CTATC GCCACTCTGC 

SEQ 7 - GTGTC GCCCATGTGC 

SEQ 9 --- CTCGC GCCGCTCTGC 

SEQ 11 GTATC TCCAATGTGT 

SEQ 13 GTTTC ACCAATGTGC 

SEQ 15 GTTGC GCCCATGTGC 

SEQ 17 - -- AAAGC CGCCATGGCC 

SEQ 18 AAAGC CGCCATGGCC 

SEQ 20 - GTCTC GCCCATGTGC 

SEQ 21 - - GTCTC GCCCATGTGC 

SEQ 23 -- - - GTTTC GCCGATGTGC 

SEQ 25 - - -- GTCAG CCCCATGTGC 

SEQ 26 GTCAG CCCCATGTGC 

SEQ 28 - - - - - CTTGC CCCTCTCTGC 

SEQ 29 CTTGC CCCTCTCTGC 

SEQ 32 

SEQ 34 - - - GTCTC CCCAATGTGT 

SEQ 36 CTTGC CCCTCTCTGT 

SEQ 37 --- - - -- --- -- CTTGC 

SEQ 39 - -- TTGAG 

SEQ 41 - - - - 

SEQ 43 --- - - AACGC GGCCATGTCG 

SEQ 82 GTCAG CCCCATGTGC 

SEQ 84 GTCTC TCCCATGTGC 
SEQ 1 CAATACTCCG CC CAGGACG GCCACATGAC CGAC TACCACATCG CCCATCTGGG TGGGATCGCC CAACGCGGAC

SEQ 2 CAATACTCCG CC CAGGACG GCCACATGAC CGAC TACCACATCG CCCATCTGGG TGGGATCGCC CAACGCGGAC 

SEQ 4 CAATACTCAG CC GACGATG GACACATGAC TCCC TGGCATATGG CACATCTTGG AGGGATTGCC CAGCGAGGGC 

SEQ 5 CAATACTCAG CC GACGATG GACACATGAC TCCC TGGCATATGG CACATCTTGG AGGGATTGCC CAGCGAGGGC 

SEQ 7 ATGTACTCCT GCGAGTCGGA CCCGTCGTCT CCCCACGTCG GCGCCCTAAC AAAC TACCACCTGG CGCATCTGGG CCACCTCGCC CTCAAAGGCG 

SEQ 9 CAGTACTCCG CA GAAGACG GCCACATGAC AGAC TACCACATCG CGCACTTGGG AGGTATTGCC CAGCGCGGCC 

SEQ 11 CAATATTCTG CT - GATTATAATT TTGAAGCAAC TCCA TACCATTTAA TCCATTATGG TTCATTAGTG AATCGTGGGC 

SEQ 13 ATGTATTCAT CG TCA CCAACTGACA ATCAAGCCAC TCTG TTTCATTTTG TTCATTATGG ATCATTTGCT GTACGTGGAC 

SEQ 15 ACCTACTCTG CC ---GACGATG GCCACATGAC CGAC TGGCACCTTG TCCACCTGGG CTCCTTCGCC CTCCGCGGTG 

SEQ 17 GAACAAATGG GC TTCGGCA ACCACCTGCC CAAC CCCGAACTCG CCGCCGTCTA CGCCACCTGG GCCCGCGGCG 

SEQ 18 GAACAAATGG GC ---TTCGGCA ACCACCTGCC CAAC- CCCGAACTCG CCGCCGTCTA CGCCACCTGG GCCCGCGGCG 

SEQ 20 ACCTACTCAG CC GACGATG GCCACCTGAC CGAC TTCCACTTGG TGCACCTGGG CCAGTTCGCC CTGCACGGCA 

SEQ 21 ACCTACTCAG CC GACGATG GCCACCTGAC CGAC TTCCACTTGG TGCACCTGGG CCAGTTCGCC CTGCACGGCA 

SEQ 23 ACTTATTCCG CT GACCAAGAAG GGCATTTGAC AGAT TTTCACCTAG TACATCTTGG AGCGATGGGA ATGCGTGGGC 

SEQ 25 CAGTACTCCG CC -GACAATG GCCACGCGAC CGAC - TACCACCTCG TCCACCTGGG CCAGTTCGCC CTGCACGGCG

SEQ 26 CAGTACTCCG CC - - - GACAATG GCCACGCGAC CGAC TACCACCTCG TCCACCTGGG CCAGTTCGCC CTGCACGGCG

SEQ 28 CAATACTCCG CC AAAGATG GTTATGCCAC TGAT TGGCACTTGA CTCACCTCGG GGGAATAATC CAAAGAGGCC

SEQ 29 CAATACTCCG CC -AAAGATG GTTATGCCAC TGAT - TGGCACTTGA CTCACCTCGG GGGAATAATC CAAAGAGGCC

SEQ 32 

SEQ 34 CAATATTCAG CA AAGGATG GTGTCATGAC CCCC TGGCACAAAC AACACCTGGG CAGCTTCGCA GCACGAGGTC 

SEQ 36 CAATACTCCG CC- AAAGATG GATATGCTAC TGAT TGGCACTTGA CTCATCTCGG AGGCATTATC CAACGAGGCC 

SEQ 37 CAATACTCCG CC AAAGATG GATATGCTAC TGAT TGGCACTTGA CTCATCTCGG AGGCATTATC CAACGAGGCC 

SEQ 39 CAGTACTCTG CT CCCGACG GACACTACAC AATG TGGCATCACA CCCACATGGG CGGCATCATC CAACGCGGTC 

SEQ 41 --- - 

SEQ 43 GAGGGCCTGG CG ACGTT TGACGAGGCG GACCCGTCCA AGCGCGGCAT CCCGACGGAG CAGCTGGTGC AGCTGTACCG GCGCTGGGGC CAGGGCGAGT

SEQ 82 CAATACAGTG CC CGTGACG GCTTTCAGCA GCCT TGGCACTTTG CCCACTACGG CGGACTGGCC CAACGTGGCC 

SEQ 84 CAGTACTCTG CG - AACAATG GTCTTCCTAC TCCG- TACCACATTG CGCATTTGGG ATCGTTTGCC CTGCACGGTG 



SEQ 1
SEQ 2
SEQ 4
SEQ 5
SEQ 7
SEQ 9
SEQ 11
SEQ 13
SEQ 15
SEQ 17
SEQ 18
SEQ 20
SEQ 21
SEQ 23
SEQ 25
SEQ 26
SEQ 28
SEQ 29
SEQ 32
SEQ 34
SEQ 36
SEQ 37
SEQ 39
SEQ 41
SEQ 43
SEQ 82
SEQ 84



CCGGCCTGAT GCTGATTGAG GCGACCGCCG TCCAGCCCGA A- 
CCGGCCTGAT GCTGATTGAG GCGACCGCCG TCCAGCCCGA A- 
CAGGATTCTT GATGGTCGAG GCAACAGCAG TCGAACCGGA A- 
CAGGATTCTT GATGGTCGAG GCAACAGCAG TCGAACCGGA A- 
CAGGCCTCGT CTTCATCGAA GCCACCGCCG TGCAGCCCAA C- 
CCGGTCTCAT GATGATCGAG GCAACCTCCG TCTCACCTGA A- 
CAGGTATCAC CATTGTTGAA AGCACGGCTG TTTCTCCTGA G- 
CAGCATTAAT CATTTTAGAG AGTATCTTTG TGTCCGAAAA T- 
TCCCCCTCAC CATCTTCGAG GCCACCGGCG TCCTCCCCAA C- 



-GGCCGC ATCACCCCTC AGGATGTCGG TCTGTGGAAG GACTCC CA 

-GGCCGC ATCACCCCTC AGGATGTCGG TCTGTGGAAG GACTCC CA 

-GGCAGG ATCACCCCGC AGGACCTGGG ACTATGGAAA GACTCG CA 

-GGCAGG ATCACCCCGC AGGACCTGGG ACTATGGAAA GACTCG CA 

-GGGCGC ATCTCCCCCA ACGACTCGGG CCTCTGGCAG GACGGCACCA CCTCGGAACA 

-GGCAGA ATCACGCCGC AGGACGTCGG TTTATGGAAG GACTCG CA 

-GGTGGA TTATCACCTC ATGATTTAGG AATCTGGAAG GATGAA CA 

-TCCGGA TTATCCATTC ATGATTTAGG TCTTTGGAAT GATGAT-- CA 

-GGCCGC ATCACCCCCG AGTGCTCTGG TCTCTGGCAG GACTCC CA 

ACTGGGGCCT GATTCTCACC GGCAACGTCC AAGTCGACCA CGCGCACAAG GGCGACGCCC ACGACATCAG CCCCAACCAC CCCGGCACCA CGCCCGAGCA 
ACTGGGGCCT GATTCTCACC GGCAACGTCC AAGTCGACCA CGCGCACAAG GGCGACGCCC ACGACATCAG CCCCAACCAC CCCGGCACCA CGCCCGAGCA 

CGGCCCTGAC CATTGTCGAG GCCACATCCG TCACGCCCAA C GGACGC ATCTCGCCCG AGGACAGCGG CCTGTGGCAA GACAGC CA 

CGGCCCTGAC CATTGTCGAG GCCACATCCG TCACGCCCAA C-- -GGACGC ATCTCGCCCG AGGACAGCGG CCTGTGGCAA GACAGC CA 

CTGGCCTTGT AATGGTAGAA GCGACAGCGG TTTCCCCAGA G---GGACGA ATTTCACCTA ATGATTCAGG ATTATGGATG GAGTCG CA 

CCGCCCTGTC CATGGTCGAG GCCACCGCCG TCGAGGCTCG T GGCCGC ATCTCCCCCG AGGATGTCGG TTTGTGGCAG GACTCG CA 

CCGCCCTGTC CATGGTCGAG GCCACCGCCG TCGAGGCTCG T GGCCGC ATCTCCCCCG AGGATGTCGG TTTGTGGCAG GACTCG CA 

CCGGATTGTC CATGGTGGAG GCTACCGCTG TACAAAACCA C GGTCGC ATCACACCTC AGGATGTTGG TCTGTGGGAA GACGGC CA 

CCGGATTGTC CATGGTGGAG GCTACCGCTG TACAAAACCA C GGTCGC ATCACACCTC AGGATGTTGG TCTGTGGGAA GACGGC CA 

CGGGTCTCAT TGTCACAGAA GTCAACGCAG TTTCACCAGA G GGACGA ATCAGTCCTG AGGATGCAGG CATCTACGAT GATGGG CA 

CGGGACTGTC CATGGTAGAG GCCACCGCTG TTCAAAACCA C GGTCGC ATCACGCCTC AGGACGTTGG TCTCTGGGAA GATGGA CA 

CGGGACTGTC CATGGTAGAG GCCACCGCTG TTCAAAACCA C GGTCGC ATCACGCCTC AGGACGTTGG TCTCTGGGAA GATGGA CA 

CCGGACTCAC CTGCGTTGAA GCCACAGCCG TGACTCCTCA A- --GGTCGC ATCACGCCTG AAGACGTCGG TATCTGGCAA GATTCT CA 

GGGGCCAGAT CCAGACGGGC AACGTCATGA TCGACCCGGA GCACCTCGAG GCCCCGGGCA ACATGGTGGT GCCGCGCGAC GCCGAGCCCT CGGGCGAGCG 

CTGGCCTCAT CATGCTAGAA GCTACCGCAG TTCAAGCACG T GGCCGT ATCACACCTG AAGATTCTGG CATCTGGCTA GACTCT CA 

TGGGAAACGT CATGGTCGAA GCATCTGGTG TTGAGCCAGA G GGGAGG ATCACGCCTC AGGACCTGGG TATTTGGTCG GAACAG CA 



SEQ 1 
SEQ 2 
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15 
SEQ 17 
SEQ 18 
SEQ 20 
SEQ 21 
SEQ 23
SEQ 25
SEQ 26
SEQ 28
SEQ 29
SEQ 32
SEQ 34
SEQ 36
SEQ 37
SEQ 39
SEQ 41
SEQ 43
SEQ 82 
SEQ 84 



GATCGCCCCG ATGCGCC GGGTCATCGA CTTCGTGCAC AGCCAGGGC 

GATCGCCCCG ATGCGCC GGGTCATCGA CTTCGTGCAC AGCCAGGGC 

GATTGAGCCA TTGAGCC GCGTGATCGA GTTTGTCCAC AGTCAGAAC 

GATTGAGCCA TTGAGCC GCGTGATCGA GTTTGTCCAC AGTCAGAAC 

ATTCCTGGGG CTGAAGC GGGTCGTCGA GTTCATGCAC GCACAGGGC 

GATTGCGCCC ATGAAGC GCGTCATCGA CTTCGTGCAC TCGCAGTCC 

AGCAGAGAAA TTGAAAC CAATTGTCGA TTACGCTCAT TCTCAAAAG 

AGCTCACAGT TTACGGA AAATTGTTGA TTTTATTCAT GATCAAGAC 

GATTGCGCCC CTCAAGC GCATCGTCGA CTACATCCAC TCCCAGGGC- CAGAAGGCCG GTATC 

GACCGTCACG GCCTTCAAGG CCTGGGCGGA CGCCGCGCGC CTGAATGGC- CAGTCCAAAA 
GACCGTCACG GCCTTCAAGG CCTGGGCGGA CGCCGCGCGC CTGAATGGC- 
GATCGCTCCT ---CTGCGCC GCATCGTCGA CTACGTGCAC AGCCAGGGC- 
GATCGCTCCT ---CTGCGCC GCATCGTCGA CTACGTGCAC AGCCAGGGC- 

AATGAAGCCG TTACGAA GAATTGTTGA ATTTGCTCAT TCGCAAAAT 

GATTGCGCCG CTGAAGC GCATCGTCGA CTTTATCCAC TCGCAGAAC-

GATTGCGCCG ---CTGAAGC GCATCGTCGA CTTTATCCAC TCGCAGAAC- 

GATCGAGCCT CTGAAGC GCATCACCAC TTTCGCGCAC AGTCAGAGC-

GATCGAGCCT CTGAAGC GCATCACCAC TTTCGCGCAC AGTCAGAGC- 



CAGAAGATCG GCGTG- 
CAGAAGATCG GCGTG- 
CAGCTTATCG GCGTG- 
CAGCTTATCG GCGTG- 
GCCAAGGTCG GGATC- 
CAGAAGATTG GCGTG- 
CAATTAATTG CCATC- 
GGAATTTGCT GTAT A - 



CAGTCCAAAA 
CAAAAGATCG CCATC- 
CAAAAGATCG CCATC- 
CAAAAAATTG GGATT- 
CAGGTCGCGG CCATC- 
CAGGTCGCGG CCATC- 
CAGAAAATTG GTATC - 
CAGAAAATTG GTATC - 



GCTTGGACCT CTCCGGG ATATTGTGGA CTTTGTACAC AGCCAGGGC- GCCAAGATTG CTATT- 

AATCGAGCCC T- -TTGAAGC GCATCACTAC TTTTGCCCAC AGCCAAAGCW CAGAAGATTG GTAT- - 

AATCGAGCCC TTGAAGC GCATCACTAC TTTT GCCCAC AGCCAAAGC - CAGAAGATTG GTAT- - 

GATCGAGCCT C--TTGCCAA GGTCGTC - GA GTTTGCCCAC TCCCAGAAC - CAGAAGATCA TGATT- 

CTTCGACATG TTTTCCAAGC TCGCCGCCGC CGCCAAGGAG CACGGCAGC - CTC-ATCGTC GCG 

TGTTGAGGGA CTGCGAA AGCACGTCGA GTTTGCCCAT GCCAACAAC- TCTCTTATCG GTATC - 

TCGGGATGCA CACAAGG CGCTGGTGTC GGTGCTCAAG TCCTTCACG - GATGGTCTGG GTGTA- 



CAGCTT GCCCATGCCG GCCGGAAAGC 

CAGCTT GCCCATGCCG GCCGGAAAGC 

CAGATC GCACACGCAG GTCGCAAGGC 

CAGATC GCACACGCAG GTCGCAAGGC 

CAGCTT GCGCATGCGG GCCGGAAAGC 

CAGATT GCCCACGCCG GCCGCAAGGC 

CAATTG GGCCATGGTG GTAGAAAAGC 

CAATTG AATCACGCTG GGCGAAAGAT 

CAGCTT GCCCACGCCG GCCGCAAGGC 

CGTGCAGATC AACCACCCTG GTCGCCAGAG 
CGTGCAGATC AACCACCCTG GTCGCCAGAG 

CAACTG GCTCATGCCG GCCGCAAGGC 

CAACTG GCTCATGCCG GCCGCAAGGC 

CAATTG GCGCATGCTG GTAGAAAGGC 

CAGCTC GCCCACGCCG GTCGCAAGGC 

CAGCTC GCCCACGCCG GTCGCAAGGC 

CAGCTG TCGCATGCGG GTCGCAAGGC 

CAGCTG TCGCATGCGG GTCGCAAGGC 

CAGATA GGTCATGCTG GGAGAAAAGC 

TCAGCTC TCGCACGCTG GTCGTAAGGC 

TCAGCTC TCGCACGCTG GTCGTAAGGC 

CAGTTG GCGCATGCGG GCCGGAAAGC 

CAGGTC GGACACCCCG GTCGCCAGGC 

CAGATT GGCCATGCTG GTCGCAAGGC 

GGGCTG CAACTGGCGC ATGCGGGAAG 
SEQ 1 CACCACCGTT GCGCCCTGGA TCTCA -TTCTCGGCC ATCGCGACGG AGAAGGTCGG CGGATGGCCG

SEQ 2 CACCACCGTT GCGCCCTGGA TCTCA -TTCTCGGCC ATCGCGACGG AGAAGGTCGG CGGATGGCCG 

SEQ 4 CAGCACCGTC GCGCCATGGC TCTCG - - GCCAACGAT ACCGCCTCCG AGAAGATGGG CGGCTGGCCA 

SEQ 5 CAGCACCGTC GCGCCATGGC TCTCG -GCCAACGAT ACCGCCTCCG AGAAGATGGG CGGCTGGCCA 

SEQ 7 GAGTGCCGTT GCGCCGTGGC TGGCG GCGC AGGCGGGCAA GTCGAGTCTG AAGGCGGATG AGAGCGTTGG CGGGTGGCCC 

SEQ 9 TTCGAACATC GCCCCCTGGC TCATG AA CAAGGGCATC GTCGCGACGG AGAAGGTCGG TGGCTGGCCG 

SEQ 11 TTCTGGTCAG CCCTTATTTT TGCAC -TTGGAACAA GTTGCAGATA AATCTGTCAA TGGGTTTGCC 

SEQ 13 TGTTGAAGGG GTACCATTCC AACAA -- - -ATACAACA TGGTTGGCAA 

SEQ 15 CTCCACCAAG GCCCCCTGGC ACTAC CAGCGCGG CAAGAGCGAG CTTGCCGGCC CCGAGCAGGG TGGCTGGCCC 

SEQ 17 TCCGATGGGC GCGGGCACGC GGGGA CTGT GGGAGAAGGC GGTGGCGCCC TCGCCGGTGC CGTTGGTGTT GGGAGAGGCG 

SEQ 18 TCCGATGGGC GCGGGCACGC GGGGA CTGT GGGAGAAGGC GGTGGCGCCC TCGCCGGTGC CGTTGGTGTT GGGAGAGGCG 

SEQ 20 CAGCACAAAG GCCCCCTGGC ACGACTCCTT CACCCCCAGC GGCGAGTATA AGCCGAGAGA GGGCTTACAG GTCGTCGGAC CCGAGTATGG CGGCTGGCCT 

SEQ 21 CAGCACAAAG GCCCCCTGGC ACGACTCCTT CACCCCCAGC GGCGAGTATA AGCCGAGAGA GGGCTTACAG GTCGTCGGAC CCGAGTATGG CGGCTGGCCT 

SEQ 23 TAGCACCACT GCTCCTTATC GAGGA - TACACA GTTGCGACTG AAGCTCAAGG TGGGTGGGAG 

SEQ 25 TAGCACCCTG GCACCGTGGA TCACC- - - GAGGCTCG CGGCAAGGCG CTGGCTCAGG AGAGCGAGAA CGGCTGGCCC 

SEQ 26 TAGCACCCTG GCACCGTGGA TCACC --GAGGCTCG CGGCAAGGCG CTGGCTCAGG AGAGCGAGAA CGGCTGGCCC

SEQ 28 CAGTTGCGTA TCTCCCTGGC TAAGC - GTAAATGCT GTCGCGGCGG AAGAAGTGGG TGGCTGGCCA 

SEQ 29 CAGTTGCGTA TCTCCCTGGC TAAGC -GTAAATGCT GTCGCGGCGG AAGAAGTGGG TGGCTGGCCA

SEQ 32 

SEQ 34 GAGCACAGTC GTACCGTGGC TGGAC -CGCAAGAAC ACTGCTTTTA 

SEQ 36 TAGTTGTGTA TCTCCGTGGT TGAGC -ATCAACGCT GTTGCCGCTA AGGAAGTCGG TGGCTGGCCA

SEQ 37 TAGTTGTGTA TCTCCGTGGT TGAGC -ATCAACGCT GTTGCCGCTA AGGAAGTCGG TGGCTGGCCA

SEQ 39 GAGCACTGTG GCACCATGGT TAAGC -GGCGGCGAT GTTGCTGGTG AGGACGTCAA CGGATGGCCA 

SEQ 41 - GACT GCCGAGTAAA CGCGCCGGCA AGGAGGCGGG AGGATGGCCG 

SEQ 43 CCGCGGCAGC GTCCAGCAGC ACCCC ATTAGCGC CAGCGACGTG CAGCTTAAGC AGGAGATG -

SEQ 82 CTCCTGCGTT GCTCCTTGGT TAGAC -GCCGGACTT GCCGCTGAAA AGGCCGCTGG TGGATGGCCC

SEQ 84 GAAGGCCTCG GACTGGTCAC CTTTC TACC GCGGAGAAAA GAAGCAAAAG TTTGTGACGC AGGAGGAAGG



SEQ 1 GACCCGCGTC AAAGGGCCCG GCGATATC- - -CCCTTTGCG GAGCCCTTCG CCAAGCCCAA GGCCATGACG 

SEQ 2 GAC-CGCGTC AAAGGGCCCG GCGATATC-- -CCCTTTGCG GAGCCCTTCG CCAAGCCCAA GGCCATGACG 

SEQ 4 GGC-CGCGTC AAAGGCCCGA CAAATGTG- - -CCCTTCACC GTTAAGAACC CTGTGCCGAA GGAGATGACC 

SEQ 5 GGC-CGCGTC AAAGGCCCGA CAAATGTG- - -- CCCTTCACC GTTAAGAACC CTGTGCCGAA GGAGATGACC 

SEQ 7 GCG - GATGTG GTGGGTCCGT CGGGCGGG- -GAGGAGC ATATCTTTAG TCCCGAGGAG GATGCGTATT GGGTGCCGCG GGCGCTGAGC 

SEQ 9 GAT - CGTGTG ATCGGCCCGT CCACCGTG- - -CCCTTCCAC GAGACTTTCC CCACCCCCAA GGCCATGACC 

SEQ 11 GAC-AAAGCA GTTGCTCCTT CTGCATTG- - GCATTC- -AGACCAAAT GGTAATTTAC CTGTTCCTAA TGAGTTGACC 

SEQ 13 GAA-CATTGT GTGGGGCCAT CTACTGAG- - CCATTTAGT GATTCACACA ATACACCACG AGAATTGACT 

SEQ 15 GAG - AACGTC TGGGCCCCCA GCGC CATC- - AG CTACAACGAG GAGACCTTCC CCTTCCCCAA GGAGATGACC 

SEQ 17 TTT - GTGCCT CGCTTGTTGT CGAAAGTG CTTTTCG GCACGCCGCG GGAGCTGACG 

SEQ 18 TTT -GTGCCT CGCTTGTTGT CGAAAGTG-- CTTTTCG GCACGCCGCG GGAGCTGACG 

SEQ 20 GAT - GACGTC TGGGCCCCGA GCGCCATC - - - CCGTTCTCG GAGGACTTTC CGAACCCCAA GGAGATGACC

SEQ 21 GAT -GACGTC TGGGCCCCGA GCGCCATC-- -CCGTTCTCG GAGGACTTTC CGAACCCCAA GGAGATGACC 

SEQ 23 AAT - GATGTT TATGGACCAA ATGAAGAC - - -AGGTGGGAC GAAAACCACG CTCAACCTCA TAAGTTAACT 

SEQ 25 GAC - GACGTT GTGGCTCCCA GCGCGATT- - - CCTTACACC AAGGACTGGG CCACACCGCG TGAGTTGACT 

SEQ 26 GAC- GACGTT GTGGCTCCCA GCGCGATT-- -CCTTACACC AAGGACTGGG CCACACCGCG TGAGTTGACT 

SEQ 28 GAC - AATATC GTTGCTCCCT CGGCCATC- GC ACAAGAAAAT GGTGTGAACC CAGTTCCCAA GGCTTTCACG 

SEQ 29 GAC -AATATC GTTGCTCCCT CGGCCATC-- GC ACAAGAAAAT GGTGTGAACC CAGTTCCCAA GGCTTTCACG

SEQ 32 

SEQ 34 - 

SEQ 36 GAC - AACATT GTTGCTCCTT CTGCCATC- GC ACAAGAAGCT GGCGTGAACC CTGTTCCCAA GGCCTTCACC 

SEQ 37 GAC -AACATT GTTGCTCCTT CTGCCATC-- GC ACAAGAAGCT GGCGTGAACC CTGTTCCCAA GGCCTTCACC 

SEQ 39 CAG-GATGTC TGGGCGCCCA GTGCGATT - CCATGGAAC GAGAAGCACG CTGTCCCAAA GGAGATGTCG 

SEQ 41 GAG -GATGTT GTGGGTCCGT CGGGTGGGGA GGACTTTACG TGGGATGAGA GGTCCTCGAG CGACCCTAGT GGAGGCTACT ATGCGCCGAG AGAGTTGTCG 

SEQ 43 - TTTGGG TCAAAGTTTG GCGTGCCCAG GCCCGCTACC 

SEQ 82 GAT -GACGTT GTCGGACCTA GCAACGAG- - -CCTTTTGCT CCTGGCTACC CTACCCCCCG TGCTATTACT 

SEQ 84 GAT - CGTGTC GTCGCTCCTT CGGCCATC-- -GCATATGCG CAAGGTCACG TTACCCCTCG AGCTCTCACG 

1101 1111 1121 1131 1141 1151 1161 1171 1181 1191 



SEQ 1 CTGGATGA-G ATCGAGCAGT TCAAGAAGGA CTGGGTGGCG GCCACGAAGC GCGCCATCGC CG CCGGT GCGGACTTTG TCGAGATTCA CAATGCGCAT 

SEQ 2 CTGGATGA-G ATCGAGCAGT TCAAGAAGGA CTGGGTGGCG GCCACGAAGC GCGCCATCGC CG CCGGT GCGGACTTTG TCGAGATTCA CAATGCGCAT 

SEQ 4 AAGCAGGA-T ATCGAGGATC TGAAGACCGC CTGGGTGGCC GCTGTCAAAC GGGCTGTTAA GG CCGGA GCCGACTTTA TCGAGATCCA CAATGCGCAT 

SEQ 5 AAGCAGGA-T ATCGAGGATC TGAAGACCGC CTGGGTGGCC GCTGTCAAAC GGGCTGTTAA GG CCGGA GCCGACTTTA TCGAGATCCA CAATGCGCAT 

SEQ 7 ACGGCCGA-G GTCCGTCAGG TGGTGGCGGC GTTTGCGAAG AGCGCGCGGC TAGCGGTGCA GG CTGGG GTGGATGTTA TCGAGATCCA TGGGGCGCAT 

SEQ 9 AAGGACGA-C ATCGAGCAGT TCAAGCGCGA CTGGTTTGAT GCGTGCAAGC GGGCCATTGC CG CTGGC GCGGACTTCA TCGAGATCCA CAATGCCCAC 

SEQ 11 AAAGATGA- A ATCAAACGTG TTGTTAAGGA TTTTGGTGCT GCTGCTAGAA GAGCTGTTGA AATCAGTGGC TTTGATGCAG TTGAGATTCA TGGTGCTCAT 

SEQ 13 GTTAATGA-A ATAAATTCAA TTGTGGAAGA CTTTGCCAAT GCAGCTTGGC GGGCTGTGGA AATCTCAAAA TTCGATGCCA TTGAAATACA TTGTGCTAAT 

SEQ 15 GTCGAGCA-G ATCCACGAGC TCGTCGAGGC CTGGAAGGCG TCTGCCCAGC GTGCCCTCAA GG CCGGC TTCGACCTCA TTGAGATCCA CGCCGCCCAC 

SEQ 17 GTTGCGGA - G ATCAAGGATA TCGTGCAAAA GTTTGCGGTG ACGGCGAGGA TCACGGCCGA GG CCGGG TTCAATGGCG TCGAGATCCA TGCGGCGCAT 

SEQ 18 GTTGCGGA -G ATCAAGGATA TCGTGCAAAA GTTTGCGGTG ACGGCGAGGA TCACGGCCGA GG- - -CCGGG TTCAATGGCG TGGAGATCCA TGCGGCGCAT 

SEQ 20 GTTGAGGA-G ATTGAGGGAC TCGTCACCAG CTTTGTGGAC GCTGCCAAGC GTGCCATCGA GG--- CCGGC GTCGACATTA TTGAGATTCA CGGCGCTCAC 

SEQ 21 GTTGAGGA-G ATTGAGGGAC TCGTCACCAG CTTTGTGGAC GCTGCCAAGC GTGCCATCGA GG CCGGC GTCGACATTA TTGAGATTCA CGGCGCTCAC 

SEQ 23 GAAAAGCA-A TATGATGAAT TAGTGGATAA GTTTGTTGTT GCTGCGAAGC GTGCAGTTGA AA TAGGT TTTGATGTAA TTGAAATTCA TGGCGCTCAT

SEQ 25 ACCGAGGRRG TCGAGGGTCT GGGTGAAGAA GTTCGCCGAG TCGGCCAAGA GGTCAAATCG A GCTGGT TTTGACGTCA TTGAGATCCA CGCCGCTCA-

SEQ 26 ACCGAGGR-G TCGAGGGTCT GGGTGAAGAA GTTCGCCGAG TCGGCCAAGA GGTCAAATCG AG CTGGT TTTGACGTCA TTGAGATCCA CGCCGCT 

SEQ 28 AAGGAGGA-T ATAGAGCAAC TCAAGAGCGA CTACGTGGAA GCGGCAAAAC GAGCCATCCA TG CTGGT TTCGATGTTA TCGAAATTCA TGCAGCTCAT

SEQ 29 AAGGAGGA-T ATAGAGCAAC TCAAGAGCGA CTACGTGGAA GCGGCAAAAC GAGCCATCCA TG CTGGT TTCGATGTTA TCGAAATTCA TGCAGCTCAT

SEQ 32 -- - 

SEQ 34 

SEQ 36 AAGGAGGA-T ATCGAGGAAC TCAAGAATGA CTTTCTGGCT GCAGCMAAAC GAGCCAWCCG CGC TGGT TTTGATGTCA TCGAGATCCA TGCAGCTCAT 

SEQ 37 AAGGAGGA-T ATCGAGGAAC TCAAGAATGA CTTTCTGGCT GCAGCMAAAC GAGCCAWCCG CGC TGGT TTTGATGTCA TCGAGATCCA TGCAGCTCAT 

SEQ 39 TTGGATGA-T ATCGAGGCTT TCAAGAAGGC GTTTGGAGAG GCGGTCAAGC GGGCATTGAA GGC TGGA TTTGATGTTA TTGAGATTCA CAATGCTCAC

SEQ 41 GTCAGAGA-G ATCAAGGAGA TGGTCCAAGA CTGGGCGACA GCAGCGAAAA GGGCGGTGAA AGC GGGC GTGGATGTAA TCGAAATCCA CGGCGCGCAT 

SEQ 43 AAGGAGGA-T ATTAAGGCGG TGATTGAGGG TTTTGCCCAC ACGGCCGAGT ACCTTGAAAA GGC CGGT TTCGACGGTA TCGAATTGCA CGCCGCCCAC

SEQ 82 CTTGAAGA - G ATTGAACAGT TGAAGGAGGA CTTTGTTTCC GGTGTTCGTC GAGCGGTTGA AG CAGGA TTTGACACTA TCGACTTCCA TTTCGCTCAC 

SEQ 84 ACCGAGGA-C ATCAACAAGT TGCAAGACAA ATTCGTTCAG TCGGCACGAT GGGCGTTTGA AG CTGGG TATGACTACG TCGAACTTCA CAGCGCTCAC 



App No.: NYA    Expr Mail #: EV678185082US
Docket No.: HO-P03371US0
Inventor: Sandra E. Lavens et al.
Title: 2031 OXIDOREDUCTASE

9/17



SEQ 1 
SEQ 2 
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15
SEQ 17
SEQ 18
SEQ 20
SEQ 21
SEQ 23
SEQ 25
SEQ 26
SEQ 28
SEQ 29
SEQ 32
SEQ 34
SEQ 36
SEQ 37
SEQ 39
SEQ 41
SEQ 43
SEQ 82 
SEQ 84 



GGATACCTGC TGTCGTCATT CCTCTCGCCG GCCGCCAAC- 

GGATACCTGC TGTCGTCATT CCTCTCGCCG GCCGCCAAC- --- -- -- 

GGCTATCTTC TGATGTCGTT CCTCTCCCCT GCGGTCAAC 

GGCTATCTTC TGATGTCGTT CCTCTCCCCT GCGGTCAAC- 

GGCTATCTCA TCAACGAGTT CCTGAGCCCG GTCACGAAT -- - 

GGGTATCTTC TCTCGTCTTT CCTATCACCG TCTTCCAAC- 

GGTTATTTGA TTAATGAGTT CTATAGTCCT ATTTCAAAC- 

GGATGTTTAA TACACCAATT TTTAAGTAAA TTGACAAAC- 

GGCTACCTCA TTTCCGAGTT CTTGAGCCCC ATCTCCAAC - 

GGATACCTGT TGGCGCAGTT CTTGAGCAAG AAGACAAAC - -- 

GGATACCTGT TGGCGCAGTT CTTGAGCAAG AAGACAAAC- - 

GGTTACCTGA TCACCGAGTT CCTTTCGCCG CTATCAAACG TAAGTGGAGA TACTTTGTGT GGGGCTGTGC GCATACTCCC 

GGTTACCTGA TCACCGAGTT C CTTT CGCCG CTATCAAAC - 

GGTTATCTTA TATCGTCAAC AGTTAGTCCT GCCACTAAT - 



CTTCTATTAA 



GGATATCTAC TGCATCAATT CTTGAGTCCG GTAAGCAAT - 
GGATATCTAC TGCATCAATT CTTGAGTCCG GTAAGCAAT - 



GGATACKTGC TTCACCAGTT CTTGAGTCCA GTCAGTAAC- 
GGATACKTGC TTCACCAGTT CTTGAGTCCA GTCAGTAAC- 
GGATACCTCC TCCACGAATT CATCTGCCTG AGAGCAACA- 
GGGTACCTCA TCCACGAATT CCTCTCACCC ATTACCAAC- 
GGTTACCTGC TGGCCCAATT CCTGTCCGAA ACAACCAAC- 
GGTTATCTTG TTTCCAGCTT CCTGTCCCCT GCCACCAAC- 
GGATACCTGA TGCACTCGTT CCTCAGCCCG TTGACCAAT- 



1301 1311 1321 1331



SEQ 1 
SEQ 2 
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15 
SEQ 17 
SEQ 18 
SEQ 20
SEQ 21
SEQ 23
SEQ 25
SEQ 26
SEQ 28
SEQ 29
SEQ 32
SEQ 34
SEQ 36
SEQ 37
SEQ 39
SEQ 41
SEQ 43
SEQ 82 
SEQ 84 



AACCGCAC GGACCAGTAC GGCGGGTCGT TCGAGAACCG CATCCGGCTG TCTCTCGAGA TTGCGCAGTT GACTCGGGAC 

__ AACCGCAC GGACCAGTAC GGCGGGTCGT TCGAGAACCG CATCCGGCTG TCTCTCGAGA TTGCGCAGTT GACTCGGGAC 

- -ACGAGAAC AGACGAGTAC GGAGGCAGTT TTGAGAATCG CATCCGGCTC AGTCTGGAGA TCGCCAAGCT CACCCGCGAA 

--ACGAGAAC AGACGAGTAC GGAGGCAGTT TTGAGAATCG CATCCGGCTC AGTCTGGAGA TCGCCAAGCT CACCCGCGAA

-- - - AAGCGGAC GGATGCGTAC GGCGGGAGCT TTGAGAACCG GACCCGGATC GTGCGCGAGG TTGCGGCGGC TATTCGTGCG 

- -ACGCGCAC CGACGAGTAC GGCGGCTCCT TTGAGAACCG CATCCGGCTC TCTCTCGAAA TCGCCCAGGT CACCCGTGAC 

- AAGAGAAC AGATGAATAC GGTGGCAGTT TTGAAAATAG AACCAGATTT TTAAAGGAAG TTATCGATAG TGTTAAATCA 

AAGAGAGC TGACCAATAC GGGGGCTCAT TTGAAAACAG AGTTAGATTT CTTTTACAAA TAATTGAGAA TATAAAACGA 

- - CAGCGTAC CGACCAGTAC GGTGGCTCCT TCGAGAACCG CACCCGCGTT CTCCGCGAGA TCATCTCGGC CGTCCGCTCC 

- AGGCGCGG GGATGAGTAT GGCGGGTCGG CTGAGAACAG GGCGAGGATT GTTGGGGAGA TTATTAAGGA GTGCAGGAGG 

--AGGCGCGG GGATGAGTAT GGCGGGTCGG CTGAGAACAG GGCGAGGATT GTTGGGGAGA TTATTAAGGA GTGCAGGAGG 

CATTTTATTT CCTGGCACGC AGAAACGGAC AGACAAGTAC GGCGGCAGCT TTGAGAACCG CACCCGGGTC CTGATCGATA TTATCAAGGC CGTCCGGGCA 

-- AAACGGAC AGACAAGTAC GGCGGCAGCT TTGAGAACCG CACCCGGGTC CTGATCGATA TTATCAAGGC CGTCCGGGCA 

-- - - GACCGCAA TGACAAGTAT GGTGGGACAT TTGAGAAACG TATTTTGTTT CCTATGGAAG TTGTCCATTC TGTTCGTAAA 



-CAAAGAAC CGACGAGTAT GG 

-CAAAGAAC CGACGAGTAT GG 

AAC CGACGAGTAT GGTGGCAGTT TCGAGAACCG TATCAGAGTT GTCTTGGAAA TCCTTGACCT CATCCGCGCT 



-CAAAGAAC CGATGAGTAT GGTGGCAGCT TCGAGAACCG TATCAGAGTC GTCTTGGAGA TCATTG 

-CAAAGAAC CGATGAGTAT GGTGGCAGCT TCGAGAACCG TATCAGAGTC GTCTTGGAGA TCATTG 

-CCAGGACC GACAAGTACG GGCGGAAGCT GGGAAAACCG CACTCGTCTG ACAATGGAAA GTCGTCGACC TTGTCCGCAG 
-CGCCGGAC AGATTCTTAC GGCGGTTCTT TCGAAAACCG TACCCGTCTA CTCATTGAAA TCGTAACAGC CGTCCGAGCC 
-CAGCGCAC CGACGAGTAC GGCGGCAGCC TCGAAAACCG CATGCGGCTA ATCCTCGAGG TCACGGCCGA GGTCCGCAGG 
-AAGCGTAC CGACAAGTAC GGAGGTAGCT TCGAGAACAG AGTGCGCCTT GCTCTCGAGA TTGTCGAGGC TGCACGAGCT 
-CAGCGTAC CGACGAGTAC GGCGGTAGCC TGGAGAACCG CGCTCGATTT CTGCTCAACG TTGCCCGTCG AATCCGCCAA 



1431 1441 1451 1461 1471 1481 1491



SEQ 1 

SEQ 2 

SEQ 4 

SEQ 5 

SEQ 7 

SEQ 9 

SEQ 11 

SEQ 13 

SEQ 15 

SEQ 17 

SEQ 18 

SEQ 20

SEQ 21

SEQ 23

SEQ 25

SEQ 26

SEQ 28

SEQ 29

SEQ 32

SEQ 34

SEQ 36

SEQ 37

SEQ 39

SEQ 41

SEQ 43

SEQ 82 

SEQ 84 



GCCGTCGGCC CTCATGTGCC C GTTTT CCTGCGCATT TCGGCCTCGG ACTGGTGCGA GGAGACCCTG CCGGA 

GCCGTCGGCC CTCATGTGCC C GTTTT CCTGCGCATT TCGGCCTCGG ACTGGTGCGA GGAGACCCTG CCGGA 

AATGTGCCCA AGGATATGCC T GTCTT CCTGCGGGTC TCCGCCACCG ATTGGCTGGA GGAGGTGCAG CCGAA 

AATGTGCCCA AGGATATGCC T - - -GTCTT CCTGCGGGTC TCCGCCACCG ATTGGCTGGA GGAGGTGCAG CCGAA 

GTGATTCCCG AGGGGATGCC C CTGTT TCTGCGTATC AGCGCCACGG AGTGGTTGGA GGGTCAGCCG GTGGC 

GCCGTCGGCC CCAACGTTCC T G TTTT TCTCCGTGTC TCCGCGACGG ACTGGATCGA GGAGACCCTC CCCGA 

AGTATTCCAA ACGATGTTCC A GTGTT TTTGAGAATC TCTGCTGCTG AAAATAGTCC TGATCCA 

AAGATAGAAA CA CC G - - - ATTTT CTTAAAGTTT CCAATGTCAG ATAATTGTAG TGATCCG 

GTCATCCCCG AGGACATGCC C CTCTT CGTCCGTGTC TCCGCCACCG AGTGGATGGA GTACACC 

CAGGTGACTG AGGCGGTGGG TGAAGAGGAG GCGAAGAAGT TTGTGGTGGG AATCAAGCTG AACAGTGCGG ATTGGCAGGC GGGACGCGAT GGA 

CAGGTGACTG AGGCGGTGGG TGAAGAGGAG GCGAAGAAGT TTGTGGTGGG AATCAAGCTG AACAGTGCGG ATTGGCAGGC GGGACGCGAT GGAAAG 

GTGATTCCCG AGGAGATGCC A CTCTT CGTCCGAATC TCCGCGACCG AATGGATGGA GTACGCCGGC 

GTGATTCCCG AGGAGATGCC A CTCTT CGTCCGAATC TCCGCGACCG AATGGATGGA GTACGCCGGC 

GCAATTCCAG ATAGTATGCC C TTGTT TTATAGAGTA ACGGCTACAG ATTGGTTGCC 



GCCATCCCCG AAACTACACC T GTCCT CGTTCGTGTC AGTGCAACTG ATTGGTTCGA GTTTGACTCT CAATTCAAAG 



CATT -- 

GCGATGCCCT CCAGCATGCC T- 
CGGACGAGCA AGAATTTCAT C- 
GTTATGCCTG AGGACATGCC C- 
GAATTCCCCA ACAAGGGT- - -- 



-CTCTT CCTCCGCCTC TCCTCTACAG AATGGATGGA AGATACCGAC ATCGGC- 

- CTCGG CATCAAAATT AACAGCGTCG AGTTCCAGGA GAAG 

-TTGTT CACTCGCATC AGTGGAACTG ACTGGCTGGA GAACAACCCT GAG 

-CTCTG GGTGCGCGTC AGCTCCACCG ACTGGGCCGA CCAAGCGCAC CAA 
1501 1511 1521 1531 1541 1551 1561 1571 1581 1591



- ****** ********** *****#####

SEQ 1 GCAGAGCTGG AAGTCGGAGG ATACCGTGCG GTTCGCGCAG GAGCTGGTCA AGCAGGGCGC CGTTGATCTG ATCGATATCA GCAGCGGTGG 

SEQ 2 GCAGAGCTGG AAGTCGGAGG ATACCGTGCG GTTCGCGCAG GAGCTGGTCA AGCAGGGCGC CGTTGATCTG ATCGATATCA GCAGCGGTGG

SEQ 4 CAA GCCCAGCTGG CGAGGCGTGG ACACTGTCCG ATTTGCGAAG ATCCTGGCAG AAACGGGTTA CGTTGACGTG CTTGACGTGA GCAGTGGCGG 

SEQ 5 CAA GCCCAGCTGG CGAGGCGTGG ACACTGTCCG ATTTGCGAAG ATCCTGGCAG AAACGGGTTA CGTTGACGTG CTTGACGTGA GCAGTGGCGG 

SEQ 7 -CGCGGAGTC GGGCAGCTGG GATAT GC AGAGCTCGCT GGAGCTGGTC AAGAAGCTGC CCGAATGGGG CATTGACCTG GTGGATGTCA GCTCCGCCGC 

SEQ 9 GGAATCGTGG AAGCTCTCTG ACTCCGTCCG CTTCGCCGAA GCCCTCGCTG CCCAGGGCGC TATTGACCTG ATCGACGTCT CTTCCGGCGG 

SEQ 11 -GAAGCTTGG ACTATTGAAG ATTCCAAAA- - -AATTAGCT GACATTTTAG TAGAAAAGGG TATTGCTTTG GTTGATGTTT CATCTGGTGG 

SEQ 13 - GAAGCGTGG TCTACGGAAG ATGCATTGA- - -AGTTGGCC GATCTTGTTA TTGATTTAGG AGTAAAGGTG ATCGACGTTA CATCAGGTGG 

SEQ 15 GGCCA GCCCTCGTGG GACCTCCAGC AGACCATTG - - -AGCTCGCC AAGATCCTCC CCGACCTCGG CGTCGACCTC CTCGACGTCT CTTCCGGCGG 

SEQ 17 AGGAGGAGGA GGAGACGGAT ACGGCGGAGG AGGTGTTGA- - -AGCAGATT GAGCTTTTTG AGCAGTGGGG GATCGACTTT GTCGAGGTTA GCGGTGGCAG 

SEQ 18 - - GAGGAGGA GGAGACGGAT ACGGCGGAGG AGGTGTTGA- --AGCAGATT GAGCTTTTTG AGCAGTGGGG GATCGACTTT GTCGAGGTTA GCGGTGGCAG 

SEQ 20 -- -GA GCCTAGCTGG GACCTCGAGC AGAGCACAC- - - AGCTTGCC AAGCTCCTCC CGGACCTGGG TGTCGACCTG CTCGACGTCA GCTCGGGCGG

SEQ 21 GA GCCTAGCTGG GACCTCGAGC AGAGCACAC- --AGCTTGCC AAGCTCCTCC CGGACCTGGG TGTCGACCTG CTCGACGTCA GCTCGGGCGG 

SEQ 23 GGATGG GAGATAGAAG ATACAGTTG- - - CATTAGCA GCGAGGCTTC GCGATGGTGG TGTTGACTTG ATAGATGTTA GCTCTGGTGG 

SEQ 25 - -- - 

SEQ 26 

SEQ 28 - - 

SEQ 29 

SEQ 32 ACGAGTTTCC TGAAAGCTGG ACAGTCGAGC AGACTT G TCAACTCGCG CGTATCTTGC CCAAGCATGG AGTAGACTTG GTGGACGTCA GCTCAGGCGG 

SEQ 34 

SEQ 36 

SEQ 37 

SEQ 39 - - 

SEQ 41 - -AAGAAGTT CGGAAGCTGG GATGTCGAAA GCACGATCA - - - AGATCTCC AAAATCCTGG CCGACTTGGG CGTTGATCTC CTCGACGTGT CTTCCGGTGG 

SEQ 43 -GGTTTCAAG CCA GAGG AGGCGGTGC- - - AGTTGTGC GAGGCCCTCG AGGCCGCGGG CATGGATTTT GTCGAGACGA GCGGCGGCAC 

SEQ 82 - - TACGAGGG AGAGACCTGG ACTCTTGAGC AGAGCATCA - - -AGCTTGCA CACCAGTTAG CAGACCGTGG TGTCGATGTT TTGGATGTTT CCAGTGGTGG 

SEQ 84 GC CGACTCTTGG ACCGTTGACC AGACGGTTG- - -AACTCGCC AAGATGCTCC AAGAGGCTCG AGTCGACCTG CTAGACGTCA G CTTCCGGCGG 

1601 1611 1621 1631 1641 1651 1661 1671 1681 1691 



########## ######

SEQ 1 TGTTCTCGCG CAG - 

SEQ 2 TGTTCTCGCG CAG 

SEQ 4 CACTCATTCG GAG 

SEQ 5 CACTCATTCG GAG 

SEQ 7 GAACCACAAG GAC 

SEQ 9 TGTCCACGCC GCG 

SEQ 11 TAACGATTAT AGA- -- - 

SEQ 13 AAATGTTGCG CAT - - 

SEQ 15 CAACAACAAG GAC 

SEQ 17 TTATGAGGAT CCTCAGGTAA GTTTTGGTGT TGTTTGAGGG ATGGGGCAAG GGGTTGTCTG TCGTGAACAA CAAAAGGGGC ACGGAACAAA TGCTAACGCC 

SEQ 18 TTATGAGGAT CCTCAG -- 

SEQ 20 AAACTCGGTG GCC- - - 

SEQ 21 AAACTCGGTG GCC - - - 

SEQ 23 TAATCACAAG GAT -- -- 

SEQ 25 - -- 

SEQ 26 

SEQ 28 -- 

SEQ 29 -- - 

SEQ 32 TATCCATCCT AAG - 

SEQ 34 

SEQ 36 - - - - 

SEQ 37 

SEQ 39 

SEQ 41 GAATCATCCT CAG 

SEQ 43 CTATGAGAGT TTT 

SEQ 82 CATCCACAAG ATG- - 

SEQ 84 CCTGGTTCCA TTC 

1701 1711 1721 1731 1741 1751 1761 1771 1781 1791 



#### ########## ########## ########## ##

SEQ 1 CAG AAGATCAAGT CCGGCCCTGC CTTCCAGGTG CCTTTTGCCG TGGCCGTGAA GAAGGCCGTC GGCGAC

SEQ 2 CAG AAGATCAAGT CCGGCCCTGC CTTCCAGGTG C CTTTT GCCG TGGCCGTGAA GAAGGCCGTC GGCGAC 

SEQ 4 - CAG CATATCCACG CGAAGCCAGG CTTCCAGGCA CCCTTTGCTA TTGCCGTCAA GAACGCCGTC GGGGAC 

SEQ 5 CAG CATATCCACG CGAAGCCAGG CTTCCAGGCA CCCTTTGCTA TTGCCGTCAA GAACGCCGTC GGGGAC

SEQ 7 - - CAG AAGATCAACC TGCACACGGC CTACCAGACG GACCTGGCCG GGCAGATTCG CCAGGCCATC CGAGCG 

SEQ 9 -- CAG AAGATCAAGT CCGGGCCGGC TTTCCAGGCT CCCTTCGCTG TGGCTATCAA GAAGGCCGTT GGCGAT 

SEQ 11 C AACCACCAAG ATCTGGGATC AGTAAAGAGT TGAGAGAGCC AATCCATGTT CCGTTGTCTC GTGCAATTAA ACAACATGTT GGTGAC 

SEQ 13 T GCAAATCTAG ATATCTATTA AATGACGACA AACAACTACC TTCTCAAGTG CCCTTGGCTC GTAAATTGAA AAGCCACATT AGAAAC 

SEQ 15 CAG AAGATCAACG TCCACACCTA CTACCAGATC GACATGGCCG AGCAGATCCG CGCGGCCGTG CACGAGGCCG 

SEQ 17 ATACAGATGG CCAACGGTCC CAAGCCCGAA AAGTCCGAAC GCACCATGGC CCGCGAGGCC TTCTTCCTCG AGTTCGCCAA GATCATCCGC ACCAAG T 

SEQ 18 ATGG CCAACGGTCC CAAGCCCGAA AAGTCCGAAC GCACCATGGC CCGCGAGGCC TTCTTCCTCG AGTTCGCCAA GATCATCCGC ACCAAG- - -T 

SEQ 20 CAA AAGATCGAGC TCACGCCGTA CTACCAGATC GACCTGGCAG CCAAGATCCG CGAGGCCGTC GGCGAT 

SEQ 21 CAA AAGATCGAGC TCACGCCGTA CTACCAGATC GACCTGGCAG CCAAGATCCG CGAGGCCGTC GGCGAT 

SEQ 23 CAA AGAATTGAGG TGAAGGATTG CTATCAAGTT CCTTTTGCGG AAAAGATTAA GGATCAAGTG AATGGA 

SEQ 25 - - -- 

SEQ 26 -- -- 

SEQ 28 

SEQ 29 - - 

SEQ 32 -TCCGCCATC GCCATCAAGT CCGGTCCTGC TTACCAGGTA GACCTCGCCA AACAGGTAAA GAAGGCTGTT GGCGAT 

SEQ 34 - 

SEQ 36 

SEQ 37 -- 

SEQ 39 - - - - 

SEQ 41 CAG AAAATCAACA TGTTCAACAC C_ - 

SEQ 43 G GTTTTGCGCA CCGCAAGGAG TCCAGCCGCA AGCGGGAGAA CTATTTTATC GAGTTCGCCG AGGTCATCCG CAAGGCCGTC AAGCAC 

SEQ 82 CAA AAGGTCGCTG CTGGTCCCGG TTACCAGGCA CCTCTTGCCA AGGCGATCAA GAAGTCAGTT GGAGAC 

SEQ 84 --- - CAA AAAATCACCG TGGGAGCCGG ATACCAGCTA TTCGGAGCAA AAGCCGTTCG CGATGCTCTG GCCAAA 
SEQ 1 AAGCT GCTGGTTGCC GCCGTGGGTG CCATCACC- - - AACG GCAAGCAGGC

SEQ 2 AAGCT GCTGGTTGCC GCCGTGGGTG CCATCACC-- AACG GCAAGCAGGC 

SEQ 4 AAACT CGCAGTGGCA TCAGTGGGTA TGATTGCC - - - -- AGCG CGCA TTTG GC 

SEQ 5 AAACT CGCAGTGGCA TCAGTGGGTA TGATTGCC-- - -AGCG CGCATTTGGC 

SEQ 7 GCTGG CGCGTCGACT CTTGTGGGTG CTGTAGGTCT GATCACCGAT TCGGAACAGG CGAGGGGACT AGTTCAGGGA GCGGACGAGG CGACTGCAGC 

SEQ 9 AAGCT CCTTGTTGCG ACGGTGGGCA CGATCACG AACG GTAAGCAGGC 

SEQ 11 AAGTT ATTGGTCAGT TGCGTTGGTG GGCTTGAA- - - A AAGATCCTGA 

SEQ 13 CGATG TTTGATCGCA TGCAGTGGAG GATTAGAT- - C GAGACATATT 

SEQ 15 GCAAGCAGCT CCTCGTCGGT GCCGTCGGCT TGGTCACC - - TCG GCTGAGATCG CCAAGGAGAC CGTCCAGGAG AAGGAGGATG GCAGAGTCAC 

SEQ 17 TCCCCAAGCT TCCTCTCATG GTCACCGGCG GCTTCCGC- - ACTC GTCAGGGCAT 

SEQ 18 TCCCCAAGCT TCCTCTCATG GTCACCGGCG GCTTCCGC-- ACTC GTCAGGGCAT 

SEQ 20 AGGTT GCTCATAGGC GCGGTCGGCA ACATCAAC - - - -ACGG CTGACATTGC

SEQ 21 AGGTT GCTCATAGGC GCGGTCGGCA ACATCAAC-- ACGG CTGACATTGC 

SEQ 23 AT ACTACTTGGC GCTGTCGGAA TGATCAGG - --GATG GTCTTACGGC 

SEQ 25 -- -- - 

SEQ 26 - - -- 

SEQ 28 -- - - - -- 

SEQ 29 

SEQ 32 AGTGT ACTTGTTTCA GCAGTAGGTG GAATCAAG - A CTGGACATCT 

SEQ 34 -- 

SEQ 36 - 

SEQ 37 - -- 

SEQ 39 

SEQ 41 - 

SEQ 43 ATGGT GGTCTACACC ACCGGCGGCT TCAAGACG- - - - - GTGGGCG CCATGGTCGA

SEQ 82 AAGAT GTTGATCAGC ACTGTTGGTA GCATCAAG-- ATAG GTACCCTTGC 

SEQ 84 - -ATCGAACC CGACGCGTCC AAACGCATGC TCGTCGGGG- --CCGTGG GAATGATGGA 



SEQ 1 GAATCAG ATTCTAG AGGAGCAG - - -- 

SEQ 2 GAATCAG ATTCTAG AGGAGCAG-- - - 

SEQ 4 CAATTCC TTGTTGG AGAAGGAC - - 

SEQ 5 CAATTCC TTGTTGG AGAAGGAC-- -- 

SEQ 7 CGAGGCAATG CTGTCGGGAC CTGAACCC - - - - 

SEQ 9 GAACAAG CTGCTTG AGGAGGAG- - - 

SEQ 11 ATTGCTCAAC AAATATTTAG AAGAAGGA- - -- 

SEQ 13 TAAACTCGAT GAGTTTATTG CTAATGGT- - 

SEQ 15 CATCCAGCGC GAGAACGGCG CCAAGACT- - - 

SEQ 17 GGAGGCC GCTTTGG AATCCGAT- - -- -- --- -- -- 

SEQ 18 GGAGGCC GCTTTGG AATCCGAT 

SEQ 20 GCGCGATGTC GTGGATGAGC AGGGCGCCGA GAAGGTGGCC GAGGCCAAGC AGACGCATGA CACCATCGAG GTCGTGAGCG AATCACATGG CGGCAAGACC 

SEQ 21 GCGCGATGTC GTGGATGAGC AGGGCGCCGA GAAGGTGGCC GAGGCCAAGC AGACGCATGA CACCATCGAG GTCGTGAGCG AATCACATGG CGGCAAGACC 

SEQ 23 GAATGAAATC CTAGAAAGTG GAAAAGCT - 

SEQ 25 - - -- ~ 

SEQ 26 

SEQ 28 

SEQ 29 -- - 

SEQ 32 TGCTGAA GAGGTTT TGCAATCT- - -- - 

SEQ 34 - - 

SEQ 36 - -- 

SEQ 37 - - 

SEQ 39 --- - - - --- -- - -- -- 

SEQ 41 

SEQ 43 CGCGCTGCAG GGCGTCGATG GG 

SEQ 82 GGAGGAG ATCATCG CTGGAGGAGA GGACGATACC 

SEQ 84 AGGTTCC TACGATT CGCCCAAC- - 



SEQ 1 GATATCGACG TTGCGCTGGT TGGCCGTGGG TTCCAGAAGG ATCCCGGTCT GGCCTGGACG TTTGCTCAGC ACCTCGGCGT C 

SEQ 2 GATATCGACG TTGCGCTGGT TGGCCGTGGG TTCCAGAAGG ATCCCGGTCT GGCCTGGACG TTTGCTCAGC ACCTCGGCGT C 

SEQ 4 GGACTGGACC TTGTGCTGGT TGGACGTGGC TTCCAGAAGA ACCCGGGGCT GGTGTGGGCG TGGGCCGACG AGCTGAATGT A 

SEQ 5 GGACTGGACC TTGTGCTGGT TGGACGTGGC TTCCAGAAGA ACCCGGGGCT GGTGTGGGCG TGGGCCGACG AGCTGAATGT A 

SEQ 7 AAGGCGGATG CCATTCTGAT AGCCCGTCAG TTCCTGCGCG AGCCAGAATG GGTGTTTTCC ACGGCGAGAA AGTTGGGCGT G - 

SEQ 9 GGATTGGATG TTGCGCTTGT GGGACGTGGT TTCCAGAAGG ATCCCGGTCT GGCGTGGACT TTCGCGCAGC ATCTTGATGT T 

SEQ 11 ACATTTGATC TTGCTTTGAT CGGTAGAGGA TTTTTAAGAA ATCCAGGTTT GGTATGGGAG TTTGCCGATA AACTTGGTGT T 

SEQ 13 GACTTTGATA TAG CATTGAT AGGTAAAGGA TTTCTCAAAA ACACTGGATT GATCAGCCGT ATTGCTGACC AATTGCAAGC A 

SEQ 15 CGTGCCGATA TGGTCCTTGT TGCCAGGCAG TTCTTGAAGG AGCCCGAGTT CGTCCTCACT GTCGCCGACG AGTTGGGTGT T 

SEQ 17 GATTGCGACA TGATCGGTAT CGGACGCCCG GCCATCATCA ACCCTTCGCT TCCCGCCAAC TTGATCCTCA ACCCGGAGGT G 

SEQ 18 GATTGCGACA TGATCGGTAT CGGACGCCCG GCCATCATCA ACCCTTCGCT TCCCGCCAAC TTGATCCTCA ACCCGGAGGT G 

SEQ 20 AAGGCGGATG TGGTCCTCAT TGCTCGCCAG TTCCTGCGCG AGCCTGAGTT TGTGCTGAGG ACGGCGCATA ACCTTGGGGT C 

SEQ 21 AAGGCGGATC TGGTCCTCAT TGCTCGCCAG TTCCTGCGCG AGCCTGAGTT TGTGCTGAGG ACGGCGCATA ACCTTGGGGT C 

SEQ 23 - GATG TTACTTTTGT CGCAAGGGAG TTCTTAAGGA ACCCGTCGTT GGTGCTAGAC AGCGCGAACC AGTTGGGTGA A

SEQ 25 -- 

SEQ 26 

SEQ 28 - 

SEQ 29 

SEQ 32 GGTATCGACA TTGTGAGGGC TGGACGTTGG TTCCAACAGA ATCCTGGTCT GGTTCGAGCT TTTGCTAACG AGCTTGGCGT G - 

SEQ 34 - - -- 

SEQ 36 -- --- - 

SEQ 37 - -- - 

SEQ 39 -- - - 

SEQ 41 - 

SEQ 43 ATAGGCAT CGGGCGCGCA GCCGGTTCGG AGCCGGACCT CGCCAAGGAC ATCATCGCGG GCAAGGTGTC CAGCATTATC AAATACGCCA 

SEQ 82 CCCTTGGATC TTGTGGCTTC AGGCCGTCTG TTCCAGAAGA ACACTGGACT TGTTTGGTCA TGGGCTGACG ATCTGAACAC T 

SEQ 84 GGCCAAGACC GCAGCCAGAT TGGCAAGTTG GCCGAGCAGT CGATTCAGAG CGGAGAGTGT GATGCGGTAC TGTTGGCACG T GGATTGA 






12/17 



SEQ 1 -- - - GAAA TCTCCATGGC CAACCAGATC CGCTGGGGCT TCACCCGGCG TGGAGGCACC CCGTACATTG ATCCTTCGGT 

SEQ 2 GAAA TCTCCATGGC CAACCAGATC CGCTGGGGCT TCACCCGGCG TGGAGGCACC CCGTACATTG ATCCTTCGGT 

SEQ 4 GAGA TCTCCATGGC TAATCAGATC CGATGGGGTT TCTCGCGGCG CGGTGCTGGT CCTTACCTCA GGAAGAAACT 

SEQ 5 GAGA TCTCCATGGC TAATCAGATC CGATGGGGTT TCTCGCGGCG CGGTGCTGGT CCTTACCTCA GGAAGAAACT 

SEQ 7 CCGG TGACTGTCCC GGTGCAGTTT GGCAGGGCCA TTTAG 

SEQ 9 GAGA TTGCGATGGC GAGTCAGATT CGGTGGGGAT TCACAAGGCG CGGGGGCACG CCTTATATCG ACCCCAAAGC 

SEQ 11 AGAC TCCACCAGGC CTTGCAGTTA GGTTGGGGTT TCTGGCCCAA CAAACAACAA ATTGTTGATT TGATTGAAAG 

SEQ 13 CAAT TCAGAACAGC ACCTCAATAT AAGTTGGCCT TATCATAA- - 

SEQ 15 --- - GATG TCAAGGCCCC TGTTCAGTAC CTCCGTGGTC CTCTTAGCAG CAGGCCCAAG AAGTTGACCA CTGTTCCTTA 

SEQ 17 - CCGG ATGCGGATGC CCGCTTGTTC GACAAGAAGA GGGCTGAGCC GCACTGGATC GTTGAGAAGT TGGGCATGAA 

SEQ 18 - CCGG ATGCGGATGC CCGCTTGTTC GACAAGAAGA GGGCTGAGCC GCACTGGATC GTTGAGAAGT TGGGCATGAA 

SEQ 20 AATG TGCAGTGGCC TCACCAATAC CACAGAGCAG TGTGGCGCAA GGGTGCAAGG ATTTGA 

SEQ 21 - - AATG TGCAGTGGCC TCACCAATAC CACAGAGCAG TGTGGCGCAA GGGTGCAAGG ATTTGA 

SEQ 23 - AATG TTGCATGGCC AGTTCAGTAT GACTATGCAG TTAAGGGACA CAGAAAGTTA CGTTGA -- 

SEQ 25 - - -- 

SEQ 26 - 

SEQ 28 - -- 

SEQ 29 - - 

SEQ 32 GAGG TCAAGATGGC GAACCAGATT GATTGGAGCT TCAAGGGACG TGGAAAGAAA GTGAACAAGA GTTCTTTATA 

SEQ 34 

SEQ 36 - 

SEQ 37 - 

SEQ 39 

SEQ 41 -- - 

SEQ 43 TGGGGGAGGA CGAGTTTGTG CTGCAGTTGA CTGCCTGCTC GGCGCAAATA AGGCTGATGG CCAAGGGCGA GGAGCCGTTT GAC

SEQ 82 -- TCTA TCCAGATCGC TCATCAGATC GCATGGGGTT TCGGTGGCAG AGCTAAGAAG AACGCTCCCA AGCTTGTCTT 

SEQ 84 TGTCCTACCC AAGCTGGACC GAGGATGCTA GTGTAGCGCT GATGGGTACC AGGGCAGCTG GCAACCCGCA GTACCATCGC GTTCACGTGG CTAAGAAGTG 



SEQ 1 GTACAAGCAG TCTATTTTCG ATGTATAG- - 

SEQ 2 GTACAAGCAG TCTATTTTCG ATGTATAG-- 

SEQ 4 CGAGAAGATA TAA -- - - 

SEQ 5 CGAGAAGATA TAA 

SEQ 7 

SEQ 9 TTATAAGGAG AGCATCTTTG AGTAA - - 

SEQ 11 AACATCTAAA TTAGAAGTAA ATTAG 

SEQ 13 -- 

SEQ 15 A 

SEQ 17 GTCCATTGTT GGTGCTGGTG TTGAGGTGGT ACGTCACGTT CCAACCCCAT TTGCTTCATT GTGTTTCCGA GTATGTCATG CTGACTTGGT TCTTTTCTAG 

SEQ 18 GTCCATTGTT GGTGCTGGTG TTGAGGTG-- 

SEQ 20 

SEQ 21 - - 

SEQ 23 

SEQ 25 

SEQ 26 - - 

SEQ 28 -- 

SEQ 29 

SEQ 32 G - 

SEQ 34 - - 

SEQ 36 

SEQ 37 - - 

SEQ 39 -- - - 

SEQ 41 - - 

SEQ 43 -- - ATCTC AAACGCCGAC GAGGTGGCGC GGGTGACGCA GTTGATGGCG 

SEQ 82 A 

SEQ 84 A 



SEQ 1 AGTATAGATA GAGTTGAAGA TGATACCTCA TAGACGATCA ATGGACCCTT GCATATTATT TCTCGTCTCC TGCGTATGTT CAAGGTATTC ACAGTAGCTG 

SEQ 2 AGTATAGATA GAGTTGAAGA TGATACCTCA TAGACGATCA ATGGACCCTT GCATATTATT T 

SEQ 4 -- 

SEQ 5 

SEQ 7 - - 

SEQ 9 

SEQ 11 - - - - 

SEQ 13 

SEQ 15 

SEQ 17 ACGTGGTATG TGAGCGAGCT CAAGAAGCTG GCCAAGTTTT AG - 

SEQ 18 ACGTGGTATG TGAGCGAGCT CAAGAAGCTG GCCAAGTTTT AG - 

SEQ 20 -- 

SEQ 21 - -- 

SEQ 23 - - - 

SEQ 25 -- 

SEQ 26 -- 

SEQ 28 - - - 

SEQ 29 - -~ 

SEQ 32 -- 

SEQ 34 -- 

SEQ 36 - - 

SEQ 37 -- 

SEQ 39 -- 

SEQ 41 

SEQ 43 GAGGGCAAGG TG 

SEQ 82 

SEQ 84 
2401 2411 2421 2431 2441 2451 2461 2471 2481 2491

SEQ 1 CGTCCTCTTA AGTTTCTCCG TCATTCGTTC TATTCTACTC CAATCGCAAC GCATGGCGAC CACGGATCGA GTCGAATTTC TCCGTCGTTC GTATCTGATC 

SEQ 2 - 

SEQ 4 

SEQ 5 --- - 

SEQ 7 - 

SEQ 9 -- 

SEQ 11 - 

SEQ 13 - - - - 

SEQ 15 - 

SEQ 17 - - -- 

SEQ 18 

SEQ 20 - 

SEQ 21 - - - -- - 

SEQ 23 - 

SEQ 25 -- 

SEQ 26 - -- 

SEQ 28 

SEQ 29 -- - 

SEQ 32 

SEQ 34 

SEQ 36 - - 

SEQ 37 -- 

SEQ 39 

SEQ 41 - 

SEQ 43 -- 

SEQ 82 - 

SEQ 84 - - 



2501 2511 2521 2531 2541 2551 2561 2571 2581 25 

SEQ 1 AATATAAAAA GCGGGGAATG GCTTGACCCC GCGCAGAATG TCGATCTCTT CGCAAACTCT CGGTGTATAG GACGCTCAGC AACGATCAAG G 

SEQ 2 - 

SEQ 4 - 

SEQ 5 -- - - - 

SEQ 7 - - 

SEQ 9 - - 

SEQ 11 - -- - 

SEQ 13 - 

SEQ 15 -- - -- - - - -- - 

SEQ 17 - - - - 

SEQ 18 

SEQ 20 --- - 

SEQ 21 - 

SEQ 23 - - - 

SEQ 25 - -- - 

SEQ 26 -- - 

SEQ 28 

SEQ 29 

SEQ 32 

SEQ 34 - - 

SEQ 36 - - 

SEQ 37 - - - 

SEQ 39 

SEQ 41 - 

SEQ 43 --- -- - - 

SEQ 82 - - - 

SEQ 84 



Figure 2. A multiple alignment of the 2031 OR nucleic acid
sequence from A. fumigatus (SEQ 1, 2) along with related 2031 ORs
from other fungi and bacteria (see also Example 4). Regions 1-11,
marked with * or #, refer to regions conserved at the amino acid
level between ORs but not OYEs.

Fungal 2031 ORs are given by SEQ ID Nos. 1, 2, 4, 5,
and 7, A. fumigatus; SEQ ID No. 9, A. nidulans; SEQ ID Nos. 11 and
13, C. albicans; SEQ ID Nos. 15, 17 and 18, N. crassa; SEQ ID
Nos. 20, 21 and 43, M. grisea; SEQ ID No. 23 (NP_595868), S.
pombe; SEQ ID Nos. 25 and 26, C. trifolii; SEQ ID Nos. 28, 29,
31, 32 and 34, F. sporotrichioides; SEQ ID Nos. 36, 37 and 82, F.
graminearum; SEQ ID Nos. 39 and 41, M. graminicola; SEQ ID No.
84, U. maydis.
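
The species assignments listed in the Figure 2 caption can be restated as a lookup table, which may help when reading the SEQ labels in the alignment. This is only an illustrative sketch: the names `SEQ_ID_SPECIES` and `species_for` are hypothetical and not part of the source, and the SEQ ID numbers and abbreviated species names are simply those given in the caption.

```python
# Hypothetical lookup table restating the Figure 2 caption:
# abbreviated species name -> SEQ ID numbers listed for it.
SEQ_ID_SPECIES = {
    "A. fumigatus": [1, 2, 4, 5, 7],
    "A. nidulans": [9],
    "C. albicans": [11, 13],
    "N. crassa": [15, 17, 18],
    "M. grisea": [20, 21, 43],
    "S. pombe": [23],
    "C. trifolii": [25, 26],
    "F. sporotrichioides": [28, 29, 31, 32, 34],
    "F. graminearum": [36, 37, 82],
    "M. graminicola": [39, 41],
    "U. maydis": [84],
}


def species_for(seq_id):
    """Return the species the caption lists for a SEQ ID number, or None."""
    for species, ids in SEQ_ID_SPECIES.items():
        if seq_id in ids:
            return species
    return None
```

For example, `species_for(9)` gives the caption's entry for SEQ ID No. 9, and a number the caption does not list returns `None`.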

Figure 3. Recombinant 2031 OR. (A) Time course of recombinant 2031 OR induction over 24
hours after the addition of IPTG (samples without IPTG are also shown). The gel was stained
with Coomassie; a prominent band of the correct molecular weight (marked with an arrow) is
seen. (B) Coomassie-stained gel showing purified recombinant 2031 OR.

Figure 4. Phylogenetic tree showing relationships between A. fumigatus 2031 OR and similar
proteins. This demonstrates a 2031 OR clade, which can be distinguished from the OYE
proteins.

Figure 5: NADPH dehydrogenase activity of recombinant 2031 OR with cyclohexenone (CHX),
N-ethylmaleimide (NEM), menadione (MEN) or duroquinone (DQ) as substrates. Final
concentrations in the assay were as follows: 500 µM substrate, 120 µM NADPH, 1 µg/200 µL
2031 OR.
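
The final concentrations in the Figure 5 caption can be converted into per-assay amounts with simple arithmetic. The sketch below is illustrative only: it takes the concentration units as micromolar and micrograms per microlitre, and it assumes a 200 µL assay volume, which is inferred from the enzyme figure and is not stated separately in the source. The function and constant names are hypothetical.

```python
# Illustrative arithmetic only. ASSAY_VOLUME_UL is an assumption inferred
# from the enzyme figure (1 microgram per 200 microlitres) in the caption.
ASSAY_VOLUME_UL = 200.0


def nmol_in_assay(conc_um, volume_ul=ASSAY_VOLUME_UL):
    """Amount in nmol for a micromolar concentration in a microlitre volume."""
    # micromolar x microlitre = picomole; divide by 1000 to get nanomole.
    return conc_um * volume_ul / 1000.0


def ug_per_ml(mass_ug, volume_ul=ASSAY_VOLUME_UL):
    """Mass concentration in micrograms per millilitre."""
    return mass_ug * 1000.0 / volume_ul


substrate_nmol = nmol_in_assay(500.0)   # substrate at 500 micromolar
nadph_nmol = nmol_in_assay(120.0)       # NADPH at 120 micromolar
enzyme_ug_per_ml = ug_per_ml(1.0)       # 1 microgram of 2031 OR per assay
```

Under that volume assumption, NADPH is present at roughly a quarter of the molar amount of substrate.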

Figure 6: Inhibition of 2031 OR function by two inhibitors (shown in A and B) identified by
high-throughput screening.